Something About Bipartite Graphs

In this article, I will introduce some concepts from graph theory that frequently appear in problems about bipartite graphs, along with a few conclusions we need in order to solve such problems.

Let’s denote the graph $ G = (V, E) $, where $V$ is the set of vertices and $E$ is the set of edges in graph $G$.

  1. Matching: A set of edges $ M \subseteq E $ such that no two edges in $M$ share a common endpoint.
  2. Edge Cover: A set of edges $ F \subseteq E $ such that every vertex in $V$ is an endpoint of some edge in $F$.
  3. Independent Set: A set of vertices $ S \subseteq V $ such that no two vertices in $S$ are adjacent.
  4. Vertex Cover: A set of vertices $ S \subseteq V $ such that every edge in $E$ has at least one endpoint in $S$.

Now let’s see some important conclusions which help us solve problems about bipartite graphs.
Conclusion (1)
$ |\text{Maximum Matching}| + |\text{Minimum Edge Cover}| = |V| $ (assuming $G$ has no isolated vertices).
Suppose there are $X$ edges in the Maximum Matching; they cover $2X$ vertices. And suppose $Y$ vertices remain uncovered by any matching edge. To get a Minimum Edge Cover, we can pick the $X$ matching edges plus one extra edge for each uncovered vertex (such an edge exists because no vertex is isolated). So $ |\text{Minimum Edge Cover}| = X + Y $, and by definition $ 2X + Y = |V| $, which gives the conclusion.
Conclusion (2)
$ |\text{Maximum Independent Set}| + |\text{Minimum Vertex Cover}| = |V| $.
Let’s denote $V_{1}$ as the Maximum Independent Set. $V_{2} = V - V_{1}$. Since there are no edges between any two vertices in $V_{1}$ and such set is maximized, $V_{2}$ covers all the edges and $V_{2}$ is minimized. This means $V_{2}$ is the Minimum Vertex Cover.
Conclusion (3)
In a bipartite graph, $ |\text{Maximum Matching}| = |\text{Minimum Vertex Cover}| $. This is König's theorem.

Actually, finding a Minimum Vertex Cover or a Maximum Independent Set in a general graph is NP-hard. But in a bipartite graph, once we have a maximum matching, the conclusions above let us get them in polynomial time.

Fractional programming & Dinkelbach theorem

Let’s first consider such a problem: given n objects, the i-th of which has value $v_i$ and weight $w_i$, we want to pick some of them. Say the set of objects we pick is $S$, and we want to maximize $ f(S) = \frac{\sum_{i \in S} v_i}{\sum_{i \in S} w_i} $. How do we calculate that?
This is actually a fundamental example of fractional programming. Here is the normal form of fractional programming:
$$ \text{Minimize } \lambda = f(x) = \frac{a(x)}{b(x)}, \quad x \in S $$
Here $x$ is just one input from the whole space $S$; it can be anything, not necessarily a single variable. We could solve the problem by enumerating every $x$ in $S$, but the size of $S$ is usually very large or even infinite, so that approach is impractical.

This is where the Dinkelbach theorem comes in. Let’s introduce a new function:
$$ g(\lambda) = \min_{x \in S}{ (a(x) - \lambda b(x)) } $$
Suppose $\lambda^{*}$ is the optimal value. The Dinkelbach theorem says that:

  1. $ g(\lambda) = 0 \iff \lambda = \lambda^{*} $
  2. $ g(\lambda) < 0 \iff \lambda > \lambda^{*} $
  3. $ g(\lambda) > 0 \iff \lambda < \lambda^{*} $

I won’t give a detailed proof here; it’s actually easy to understand. $ g(\lambda) < 0 $ means that there exists some $x$ satisfying $ a(x) - \lambda b(x) < 0 $, i.e. $ \frac{a(x)}{b(x)} < \lambda $, so we can make $\lambda$ smaller. And $ g(\lambda) > 0 $ means there is no such $x$, so this $\lambda$ is too small.
With this, we can get the answer by binary search: fix a $\lambda$, evaluate $g(\lambda)$, and narrow the interval toward the answer. Dinkelbach also works when we want the maximum value instead of the minimum, but this time the function changes in this way:
$$ g(\lambda) = \max_{ x \in S }{(a(x) - \lambda b(x))} $$

Let’s solve a sample problem of fractional programming: POJ 3111 - K Best
My code:

#include <stdio.h>
#include <iostream>
#include <algorithm>
#include <string.h>
#include <vector>
#include <math.h>

using namespace std;

#define pb push_back
#define mp make_pair
#define ll long long
#define ull unsigned ll
#define db double
#define INF 0x3f3f3f3f
#define MOD 1000000007
#define PII pair<int, int>

const int N=100010;
int n,k;
int v[N],w[N];
struct node {
    db val;
    int idx;
    node(db a,int b):val(a),idx(b){}
    bool operator<(const node& rhs) const {
        return val>rhs.val;
    }
};
vector<node> vec;

// g(lambda) > 0 ? sort items by v_i - lambda*w_i and sum the k biggest
bool check(db val) {
    vec.clear();
    for (int i=1;i<=n;i++) {
        vec.pb(node(v[i]-val*w[i],i));
    }
    sort(vec.begin(),vec.end());
    db tot=0.0;
    for (int i=0;i<k;i++) tot+=vec[i].val;
    return tot>0.0;
}

int main()
{
    scanf("%d%d",&n,&k);
    for (int i=1;i<=n;i++) {
        scanf("%d%d",v+i,w+i);
    }
    db lo=0.0,hi=1e11;
    for (int i=0;i<50;i++) {
        db mid=(lo+hi)*0.5;
        if (check(mid)) lo=mid;
        else hi=mid;
    }
    check(lo); // recompute vec for the final lambda before printing the set
    for (int i=0;i<k;i++) {
        printf("%d ",vec[i].idx);
    }
    printf("\n");
}

This is almost a direct application of Dinkelbach. When we fix $\lambda$, we get $g(\lambda)$ by sorting all items by $ v_i - \lambda w_i $ and summing the $k$ biggest. Binary searching on $\lambda$ then gives the answer.

Connectivity & Tarjan Algorithm

Well, I think the Tarjan algorithm is quite basic in graph theory; it’s only a simple application of DFS. In spite of this, I usually refer to the internet or my past code whenever I need to use it. So I’ll try to figure out how it works in this blog and hope that I can learn it by heart.

As we all know, the Tarjan algorithm enables us to find all the articulation points in $O(|V| + |E|)$ time. It’s actually famous for finding strongly connected components in directed graphs, but here we only focus on using it to find articulation points in an undirected graph.

An articulation point, also called a cut vertex, is a vertex whose removal increases the number of connected components.

Now let’s talk about how the Tarjan algorithm works. It runs a depth-first search, and during the process it assigns two numbers to each vertex; let’s call them dfn and low. dfn[u] is the order in which u is visited during the DFS. low[u] is the smallest dfn reachable from the subtree of u using at most one back edge. So when we find a tree edge (u, v) with $low[v] \ge dfn[u]$, every path from v’s subtree to the rest of the graph has to pass through u; as a result, u is an articulation point. But be careful: there is a special case where u is the root of the DFS tree, for which the above condition always holds. A root is a cut vertex exactly when it has more than one subtree in the DFS tree; since those subtrees have no edges between them, cutting the root makes them separated. So this is the main idea.

Now let’s see how to calculate dfn and low and implement the algorithm. dfn is quite easy to get: we maintain a counter, increase it by one at each step of the DFS, and assign it to the current vertex.
For low, there are three cases:

  1. $low[u] = dfn[u]$: the base case.
  2. $low[u] = \min(low[u], dfn[v])$: when Edge(u,v) is a back edge.
  3. $low[u] = \min(low[u], low[v])$: when Edge(u,v) is a tree edge, i.e. v is a child of u.

Here is the code:

const int N=505,M=50000;
struct edge {
    int to,next;
} e[M];
int head[N],tot;
void add_edge(int u,int v) {
    e[tot].to=v;
    e[tot].next=head[u];
    head[u]=tot++;
}
int counter=0;
int dfn[N],low[N];
bool found=false;
int n,m;

void dfs(int u,int pa) {
    dfn[u]=low[u]=++counter;
    int son=0;
    for (int i=head[u];i!=-1;i=e[i].next) {
        int to=e[i].to;
        if (!dfn[to]) {
            dfs(to,u);
            if (pa!=-1&&dfn[u]<=low[to]) {
                found=true; // non-root u is a cut vertex
            }
            low[u]=min(low[u],low[to]);
            son++;
        } else if (to!=pa) {
            low[u]=min(low[u],dfn[to]); // back edge
        }
    }
    if (pa==-1&&son>1) {
        found=true; // root with more than one subtree
    }
}

It’s just a simple DFS. When the found variable is set to true, the graph has a cut vertex.

A graph that doesn’t contain any cut vertex is called 2-connected. Let’s go a little further: a 3-connected graph is one that remains 2-connected after we remove any single vertex. Then how do we tell whether a graph is 3-connected? To check whether a graph is 2-connected, we simply run the Tarjan algorithm; if we find a cut vertex, it is not 2-connected. So to check 3-connectivity, we enumerate the vertices, remove each one in turn, and run the Tarjan algorithm on the rest. If after the whole process we never find a cut vertex (and the remaining graph stays connected each time), the graph is 3-connected.
Here is a sample question: POJ 3713 - Transferring Sylla
And this is my code:

#include <stdio.h>
#include <iostream>
#include <algorithm>
#include <string>
#include <string.h>
#include <vector>
#include <math.h>

using namespace std;

#define pb push_back
#define mp make_pair
#define ll long long
#define ull unsigned ll
#define db double
#define INF 0x3f3f3f3f
#define MOD 1000000007
#define PII pair<int, int>

const int N=505,M=50000;
struct edge {
    int to,next;
} e[M];
int head[N],tot;
void add_edge(int u,int v) {
    e[tot].to=v;
    e[tot].next=head[u];
    head[u]=tot++;
}
int counter=0;
int dfn[N],low[N];
bool found=false;
int a[M],b[M],n,m;

void dfs(int u,int pa) {
    dfn[u]=low[u]=++counter;
    int son=0;
    for (int i=head[u];i!=-1;i=e[i].next) {
        int to=e[i].to;
        if (!dfn[to]) {
            dfs(to,u);
            if (pa!=-1&&dfn[u]<=low[to]) {
                found=true;
            }
            low[u]=min(low[u],low[to]);
            son++;
        } else if (to!=pa) {
            low[u]=min(low[u],dfn[to]);
        }
    }
    if (pa==-1&&son>1) {
        found=true;
    }
}

bool three_conn() {
    for (int i=0;i<n;i++) { // remove vertex i, then run Tarjan on the rest
        memset(head,-1,sizeof(head));
        tot=0;
        found=false;
        counter=0;
        for (int j=0;j<m;j++) {
            if (a[j]!=i&&b[j]!=i) {
                add_edge(a[j],b[j]);
                add_edge(b[j],a[j]);
            }
        }
        memset(dfn,0,sizeof(dfn));
        memset(low,0,sizeof(low));
        if (i==0) {
            dfs(1,-1);
        } else {
            dfs(0,-1);
        }
        // disconnected (counter!=n-1) or has a cut vertex -> not 3-connected
        if (counter!=n-1||found) return false;
    }
    return true;
}

int main()
{
    while (true) {
        scanf("%d%d",&n,&m);
        if (n==0&&m==0) break;
        for (int i=0;i<m;i++) {
            scanf("%d%d",a+i,b+i);
        }
        if (three_conn()) printf("YES\n");
        else printf("NO\n");
    }
}

Minimum Cut Problems

I think these problems are difficult because they are obscure; we can hardly recognize them and adopt a minimum-cut solution, at least for me. Maybe solving a great many of these problems would help. One characteristic of these problems is that we need to divide some elements into two sets (the source side and the sink side) to get some optimal result.
Let’s see some problems:

POJ 3469 - Dual Core CPU

Problem: POJ 3469 - Dual Core CPU
There are 2 cores A, B and N tasks to be processed. The time costs of the i-th task on A and B are $a_{i}$ and $b_{i}$ respectively. Some pairs of tasks, when processed on two different cores, incur an extra cost. For each task, decide which core it should be processed on so that the total cost is minimized.
In this problem, we divide the tasks into two sets (processed on A, and on B), so we try to reduce it to a minimum-cut problem. Apparently, the minimized cost corresponds to the minimum cut.
For each task i, we add an edge from the source to i whose weight equals $b_{i}$; if this edge is in the cut set, task i is processed on core B. For the same reason, we add an edge from i to the sink whose weight equals $a_{i}$. Finally, for each pair of tasks that affect each other, say i and j, we add two edges between i and j (one in each direction) whose weight equals the extra cost. Now the minimum cut of the graph is equal to the minimized cost we want to get.
Pseudo code:

int main() {
    for (int i=1;i<=n;i++) {
        scanf("%d%d",&a,&b);
        add_edge(0,i,b);   // source -> i: cutting this edge = run i on core B
        add_edge(i,0,0);   // reverse edge
        add_edge(i,n+1,a); // i -> sink: cutting this edge = run i on core A
        add_edge(n+1,i,0); // reverse edge
    }
    for (int i=1;i<=m;i++) {
        scanf("%d%d%d",&a,&b,&w); // extra cost
        add_edge(a,b,w);
        add_edge(b,a,0); // reverse edge
        add_edge(b,a,w);
        add_edge(a,b,0); // reverse edge
    }
    ll mx_flow=max_flow(); // get the answer
}

Code Jam - World Finals 2009 - Problem D. Wi-fi Towers

Problem: Wi-fi Towers
Given N wi-fi towers, the i-th is located at $x_{i}$, $y_{i}$, and has a range $r_{i}$. Now we want to upgrade the protocol of them. The score of upgrading the i-th tower is $s_{i}$. When the i-th tower is upgraded, all towers within its range have to be upgraded as well. Decide which towers to be upgraded to make the total score maximized.
Wow, a world finals problem!
It’s quite difficult to associate this with minimum cut. The two sets are the towers to be upgraded and the towers not to be upgraded.
We want the maximum score, so we minimize the loss instead. If $s_{i}$ is negative, its absolute value is a loss: we add an edge from the source to i whose weight is $|s_{i}|$. If $s_{i}$ is positive, we can imagine that tower i has already been upgraded at first, and “downgrading” it back incurs a loss: we add an edge from i to the sink whose weight is $|s_{i}|$. We need to add the positive values to the answer in advance, of course.
Then consider the constraints. If j has to be upgraded whenever i is upgraded, we add an edge from j to i whose weight is $\infty$. As a result, if we upgrade i without upgrading j, the cut value becomes infinite.
Now we subtract the minimized loss (the min cut, i.e. the max flow) from the accumulated score, and we get the maximum score.
Pseudo code:

int main() {
    int src=0,sink=n+1;
    int ans=0;
    for (int i=1;i<=n;i++) {
        if (s[i]<0) {
            add_edge(src,i,-s[i]);
            add_edge(i,src,0); // reverse edge
        } else {
            add_edge(i,sink,s[i]);
            add_edge(sink,i,0); // reverse edge
            ans+=s[i];
        }
        for (int j=1;j<=n;j++) {
            if (i==j) continue;
            if (sqr(x[i]-x[j])+sqr(y[i]-y[j])<=sqr(r[i])) { // j within i's range
                add_edge(j,i,INF);
                add_edge(i,j,0); // reverse edge
            }
        }
    }
    int mx_flow=max_flow();
    ans-=mx_flow;
}

Hope I will recognize the minimum-cut problem when I meet one next time…

Codeforces 992E - Nastya and King-Shamans

Problem: Codeforces 992E Nastya and King-Shamans
Given n integers $ a_{1},a_{2},\ldots,a_{n} $ and q operations $(p_{i}, x_{i})$, $1 \le n,q \le 2 \cdot 10^{5} $. Each operation changes $a_{p_{i}}$ to $x_{i}$; after each operation, output any index k such that $ a_{k} = \sum_{i=1}^{k-1} a_{i} $, if one exists.

Segment tree solution

For each query, we search for the answer from left to right. We maintain cur_sum, the sum of the $a_{i}$ we have passed, meaning we haven’t found an answer so far. We then look for the leftmost element to the right that is at least cur_sum. If it equals its prefix sum, we are done; if not, we add that element to cur_sum and continue. Since it is the leftmost such element, we don’t skip any answer. And since cur_sum at least doubles each time, the number of repeated searches per operation is at most $\log(\mathrm{MAX})$.
We can use a segment tree maintaining sums and maximum values to answer these queries.
My code:

#include <stdio.h>
#include <iostream>
#include <algorithm>
#include <string.h>
#include <math.h>

using namespace std;

#define pb push_back
#define mp make_pair
#define ll long long
#define ull unsigned ll
#define db double
#define INF 0x3f3f3f3f
#define MOD 1000000007
#define PII pair<int, int>

const int N=200010;
ll s[N*4],mx[N*4],a[N];
int n,q,p,x;

void build(int x,int l,int r) {
    if (l==r) {
        mx[x]=s[x]=a[l];
        return;
    }
    int m=(l+r)>>1;
    build(2*x,l,m);
    build(2*x+1,m+1,r);
    mx[x]=max(mx[2*x],mx[2*x+1]);
    s[x]=s[2*x]+s[2*x+1];
}

void update(int x,int l,int r,int pos,ll val) {
    if (l==r) {
        mx[x]=s[x]=val;
        return;
    }
    int m=(l+r)>>1;
    if (pos<=m) update(2*x,l,m,pos,val);
    else update(2*x+1,m+1,r,pos,val);
    mx[x]=max(mx[2*x],mx[2*x+1]);
    s[x]=s[2*x]+s[2*x+1];
}

ll get_sum(int x,int l,int r,int lq,int rq) {
    if (lq<=l&&r<=rq) return s[x];
    if (lq>r||rq<l) return 0;
    int m=(l+r)>>1;
    return get_sum(2*x,l,m,lq,rq)+get_sum(2*x+1,m+1,r,lq,rq);
}

// descend to the leftmost position in this subtree with value >= val
int down(int x,int l,int r,ll val) {
    if (l==r) {
        if (mx[x]>=val) return l;
        else return n+1;
    }
    int m=(l+r)>>1;
    if (mx[2*x]>=val) return down(2*x,l,m,val);
    return down(2*x+1,m+1,r,val);
}

int get_left(int x,int l,int r,int lq,int rq,ll val) {
    if (l>rq||lq>r) return n+1;
    if (lq<=l&&r<=rq) {
        if (mx[x]<val) return n+1;
        return down(x,l,r,val);
    }
    int m=(l+r)>>1;
    int res=get_left(2*x,l,m,lq,rq,val);
    if (res!=n+1) return res;
    return get_left(2*x+1,m+1,r,lq,rq,val);
}

int solve() {
    int idx=0;
    ll sum=0;
    while (idx<n+1) {
        int new_idx=get_left(1,1,n,idx+1,n,sum);
        if (new_idx==n+1) return -1;
        ll pre_sum=get_sum(1,1,n,1,new_idx-1);
        ll ele=get_sum(1,1,n,new_idx,new_idx);
        if (pre_sum==ele) return new_idx;
        idx=new_idx;
        sum=pre_sum+ele;
    }
    return -1;
}

int main()
{
    scanf("%d%d",&n,&q);
    for (int i=1;i<=n;i++) scanf("%lld",a+i);
    build(1,1,n);
    for (int i=1;i<=q;i++) {
        scanf("%d%d",&p,&x);
        update(1,1,n,p,x);
        printf("%d\n",solve());
    }
}

Square decomposition

Let $p_{i}$ be the prefix sum of $a$, and $f_{i} = p_{i} - 2 a_{i}$. Then we only need to find an i such that $f_{i} = 0$. Each time we change $a_{i}$ to val, let $\delta = val - a_{i}$: $f_{i}$ decreases by $\delta$ whereas every $f_{k}$ with $k>i$ increases by $\delta$.
Next we divide the sequence into groups of size M. For each change, we update at most N/M groups as a whole (a lazy offset) and handle the elements of a single group individually. For queries, we keep each group’s elements in sorted order so that we can binary search. For the single group not entirely covered by the range, we rebuild it by merging sorted vectors.
Though the solution is still too slow to pass the tests, I think square decomposition is powerful in data structure problems.
My code:

#include <stdio.h>
#include <iostream>
#include <algorithm>
#include <string.h>
#include <math.h>
#include <vector>

using namespace std;

#define pb push_back
#define mp make_pair
#define ll long long
#define ull unsigned ll
#define db double
#define INF 0x3f3f3f3f
#define MOD 1000000007
#define PII pair<int, int>

const int N=200010;
int M=500;
int n,p,q,x,a[N];
ll s[N]; // prefix sums need 64 bits
struct node {
    ll val;
    int idx;
    bool operator<(const node& rhs) const {
        return val<rhs.val;
    }
    bool operator==(const node& rhs) const {
        return val==rhs.val;
    }
};
vector<node> aim[1000];
ll acc[1000]; // lazy offset added to every element of a group

// re-insert element idx with its new value and shift the tail of its group
void modify_group(int idx,ll delta) {
    int gid=idx/M;
    vector<node> v1,v2;
    ll pivot=-1;
    bool added=false;
    for (int i=0;i<aim[gid].size();i++) {
        if (aim[gid][i].idx==idx) {
            pivot=aim[gid][i].val-delta;
            break;
        }
    }
    for (int i=0;i<aim[gid].size();i++) {
        if (aim[gid][i].idx<idx) {
            if (!added&&pivot<=aim[gid][i].val) {
                v1.pb({pivot,idx});
                added=true;
            }
            v1.pb(aim[gid][i]);
        } else if (aim[gid][i].idx>idx) {
            v2.pb(aim[gid][i]);
            v2.back().val+=delta;
        }
    }
    if (!added) { v1.pb({pivot,idx}); }
    aim[gid].clear();
    merge(v1.begin(),v1.end(),v2.begin(),v2.end(),back_inserter(aim[gid]));
}

int main()
{
    scanf("%d%d",&n,&q);
    M=sqrt(n)+1;
    for (int i=0;i<n;i++) {
        scanf("%d",a+i);
        s[i]=a[i];
        if (i>0) s[i]+=s[i-1];
    }
    for (int i=0;i<n;i++) {
        aim[i/M].pb({s[i]-2*a[i],i});
    }
    int g=n/M+1;
    for (int i=0;i<g;i++) sort(aim[i].begin(),aim[i].end());
    for (int i=0;i<q;i++) {
        scanf("%d%d",&p,&x);
        p--;
        int id=p/M+1;
        ll delta=x-a[p];
        a[p]=x;
        for (int j=id;j<g;j++) acc[j]+=delta;
        modify_group(p,delta);
        bool found=false;
        for (int j=0;j<g;j++) {
            node nd={-acc[j],-1};
            auto iter=lower_bound(aim[j].begin(),aim[j].end(),nd);
            if (iter!=aim[j].end()&&(*iter).val==-acc[j]) {
                printf("%d\n",(*iter).idx+1);
                found=true;
                break;
            }
        }
        if (!found) printf("-1\n");
    }
}

Codeforces 995C Leaving the Bar

Problem: Codeforces 995C Leaving the Bar
Given n two-dimensional vectors $\vec v_{i}$ with $ |\vec v_{i}| \le 10^{6} $, choose n integers $ c_{1},c_{2},\ldots,c_{n} $ with $ c_{i} \in \{1, -1\} $. Let $\vec p = \sum_{i=1}^{n} c_{i}\vec v_{i}$. Determine values of the $c_{i}$ such that $|\vec p| \le 1.5 \cdot 10^{6}$.

As n is at most $10^{5}$, it is impossible to enumerate all the cases.
The key to solving this problem is the following conclusion:
Given 3 vectors of length at most R, we can always find a pair such that $\vec v_{i} + \vec v_{j}$ or $\vec v_{i} - \vec v_{j}$ has length at most R.
Together with the opposites of the 3 vectors, there are 6 vectors altogether. So if we divide the plane into six 60-degree sectors, at least one sector contains two of them (and they are not a vector and its own opposite, for sure). Two vectors of length at most R with an angle of at most 60 degrees between them have a difference of length at most R. Pretty good.

So we can keep merging vectors until there are only 2 vectors left.
We always pick 3 vectors and merge 2 of them whenever the length of the resulting vector is at most $10^{6}$.
And we can ensure that the length of the sum or the difference of the last 2 vectors is at most $\sqrt{2} \cdot 10^{6} \le 1.5 \cdot 10^{6}$, because $|u+v|^2 + |u-v|^2 = 2|u|^2 + 2|v|^2 \le 4 \cdot 10^{12}$, so the smaller of the two is at most $\sqrt{2} \cdot 10^{6}$.

My Code:

#include <stdio.h>
#include <iostream>
#include <algorithm>
#include <string.h>
#include <math.h>
#include <queue>

using namespace std;

#define pb push_back
#define mp make_pair
#define ll long long
#define ull unsigned ll
#define db double
#define INF 0x3f3f3f3f
#define MOD 1000000007
#define PII pair<int, int>

const int N=100010;
const int R=1000000;
const db eps=1e-8;

struct P {
    db x,y;
    P(){}
    P(db xx,db yy):x(xx),y(yy){}
    P operator+(P rhs) {
        return P(x+rhs.x,y+rhs.y);
    }
    P operator-(P rhs) {
        return P(x-rhs.x,y-rhs.y);
    }
    P operator*(db k) {
        return P(k*x,k*y);
    }
    db dot(P rhs) {
        return x*rhs.x+y*rhs.y;
    }
    db det(P rhs) {
        return x*rhs.y-y*rhs.x;
    }
};

struct node {
    int ls,rs,type; // merge tree: children and the sign used for the right child
};

int n,tot=0;
P vec[2*N];
node nd[2*N];
int ans[N];

void dfs(int id,int type) {
    if (nd[id].ls==-1) {
        ans[id]=type;
        return;
    }
    dfs(nd[id].ls,type);
    dfs(nd[id].rs,type*nd[id].type);
}

int main()
{
    scanf("%d",&n);
    queue<int> q;
    for (int i=0;i<n;i++) {
        scanf("%lf%lf",&vec[tot].x,&vec[tot].y);
        nd[tot++]={-1,-1,1};
        q.push(i);
    }
    while (q.size()>2) {
        int a[3]={-1};
        for (int i=0;i<3;i++) {
            a[i]=q.front();
            q.pop();
        }
        int x=-1,y=-1,type=1;
        for (int i=0;i<3&&x==-1;i++) {
            for (int j=i+1;j<3&&x==-1;j++) {
                if (sqrt((vec[a[i]]+vec[a[j]]).dot(vec[a[i]]+vec[a[j]]))<R+eps) {
                    x=a[i],y=a[j];
                    break;
                }
                if (sqrt((vec[a[i]]-vec[a[j]]).dot(vec[a[i]]-vec[a[j]]))<R+eps) {
                    x=a[i],y=a[j],type=-1;
                    break;
                }
            }
        }
        for (int i=0;i<3;i++) {
            if (a[i]!=x&&a[i]!=y) q.push(a[i]);
        }
        nd[tot]={x,y,type};
        vec[tot]=vec[x]+(vec[y]*type);
        q.push(tot++);
    }
    if (q.size()<2) {
        printf("1\n");
        return 0;
    }
    int x=q.front();q.pop();
    int y=q.front();q.pop();
    if (sqrt((vec[x]+vec[y]).dot(vec[x]+vec[y]))<sqrt(2)*R+eps) {
        nd[tot++]={x,y,1};
    } else {
        nd[tot++]={x,y,-1};
    }
    dfs(tot-1,1);
    for (int i=0;i<n;i++) printf("%d ",ans[i]);
    printf("\n");
}

Subsets Property Merge

Given such a problem:
There is a sequence of length $2^{N}$: $ A_{0} $, $ A_{1} $, ..., $ A_{2^{N}-1} $. For every integer K satisfying $ 1 \le K \le 2^{N}-1 $, find the maximum value of $A_{i}+A_{j}$ where $i \subseteq K$, $j \subseteq K$, and $i \neq j$. Here, $a \subseteq b$ means that, in binary representation, the set of 1-bit positions of a is a subset of that of b.
This problem is actually a subproblem of ARC 100 E. The solution is a dynamic programming over subsets (often called SOS DP), and I think the technique can be used in many cases.
Code:


#include <iostream>
#include <algorithm>
#include <string.h>
#include <math.h>
#include <vector>

using namespace std;

#define pb push_back
#define mp make_pair
#define ll long long
#define ull unsigned ll
#define db double
#define INF 0x3f3f3f3f
#define MOD 1000000007
#define PII pair<int, int>

const int N=1<<18;
int a[N];
PII mx[N]; // indices of the two largest values over submasks, -1 = none
int n; // n is the count of digits

bool cmp(int i,int j) { // order candidate indices by value, descending
    return a[i]>a[j];
}

void solve() {
    vector<int> cur;
    for (int i=0;i<(1<<n);i++) {
        cur.clear();
        cur.pb(i);
        for (int j=0;j<n;j++) {
            if ((i>>j)&1) {
                int x=i^(1<<j);
                cur.pb(mx[x].first);
                if (mx[x].second!=-1) cur.pb(mx[x].second);
            }
        }
        sort(cur.begin(),cur.end()); // dedupe candidate indices first
        cur.resize(unique(cur.begin(),cur.end())-cur.begin());
        sort(cur.begin(),cur.end(),cmp); // then pick the two largest values
        if (cur.size()==1) {
            mx[i]=mp(cur[0],-1);
        } else {
            mx[i]=mp(cur[0],cur[1]);
        }
    }
}


The two elements of mx[k] are the indices of the two largest elements satisfying $i \subseteq k$, so the answer for k is $A_{mx[k].first} + A_{mx[k].second}$.
Here we compute the result of k from the results of its subsets that differ in exactly one bit.

Geometry Basics

Geometry is an important part of competitive programming. However, it doesn’t appear much in smaller contests, which is why I’m quite unfamiliar with it. But I’m convinced that I will have to handle these problems sooner or later.

This blog is just a very first step into geometry. I collect some very basic tools here, and we only talk about two dimensions. All the vectors mentioned below are two-dimensional.

I would introduce:

  1. Common data structure for geometry problems
  2. Dot product
  3. Cross product
  4. Are the two vectors parallel
  5. Is the point on the line or the segment
  6. Get the cross point of two lines
  7. Do the two segments touch each other

Common data structure


struct P {
    db x,y;
    P(){}
    P(db xx,db yy):x(xx),y(yy){}
    P operator+(P rhs) {
        return P(x+rhs.x,y+rhs.y);
    }
    P operator-(P rhs) {
        return P(x-rhs.x,y-rhs.y);
    }
    P operator*(db k) {
        return P(k*x,k*y);
    }
    db dot(P rhs) {
        return x*rhs.x+y*rhs.y;
    }
    db det(P rhs) {
        return x*rhs.y-y*rhs.x;
    }
};

This is a really simple struct, but it is quite powerful: it enables us to implement geometry algorithms elegantly. One of its good parts is that it can serve both as a point and as a two-dimensional vector.

Dot product

$$ \overrightarrow{AB} \cdot \overrightarrow{AC} = |\overrightarrow{AB}| \, |\overrightarrow{AC}| \cos\theta $$
The geometric meaning of the dot product is the signed (possibly negative) length of the projection of one vector onto the other, multiplied by the other’s length. The result of a dot product is a scalar.

Cross product

$$ \overrightarrow{AB} \times \overrightarrow{AC} = \overrightarrow{AD} $$
The result of the cross product is actually a vector perpendicular to the plane determined by the two vectors. But in two dimensions, we can take the result as a scalar equal to the signed area of the parallelogram determined by the two vectors.

Are the two vectors parallel

This is quite simple: when two vectors are parallel, the area of the parallelogram formed by them is 0. So we can use the cross product to get that area and tell whether the two vectors are parallel.

Is the point on the line or the segment

Let’s start with line. Call the point to be judged A, and find any two points B,C on the line.
Obviously, if A is on the line, then we have:
$$ \overrightarrow{AB} \times \overrightarrow{AC} = 0 $$
Now let’s go with segment. We call the two endpoints of the segment B, C now.
If A is on the segment, A has to be on the line first. Then B and C should be on opposite sides of A (or one of them coincides with A). We can check this simply by:
$$ \overrightarrow{AB} \cdot \overrightarrow{AC} \le 0 $$

Get the cross point of two lines

We first need to pay attention to whether the two lines are exactly the same line and whether they are parallel.
Say, $p_{1}$ and $p_{2}$ are two points on line 1, and $q_{1}$ and $q_{2}$ are two points on line 2.
Then we can represent any point on line 1 as: $ p_{1} + t \cdot (p_{2} - p_{1}) $.
And what we need to do is to calculate the $t$ that puts this point on line 2.
We can calculate $t$ as:
$$
t = \frac
{(q_{2} - q_{1}) \times (q_{1} - p_{1})}
{(q_{2} - q_{1}) \times (p_{2} - p_{1})}
$$
So the cross point is:
$$
p_{1} +
\frac
{(q_{2} - q_{1}) \times (q_{1} - p_{1})}
{(q_{2} - q_{1}) \times (p_{2} - p_{1})}
\cdot
(p_{2} - p_{1})
$$
I cannot explain rigorously why it works; we can simply think of $t$ as the ratio of the areas of two parallelograms that share the base $q_{2} - q_{1}$.

Do the two segments touch each other

Since we have 4 and 5, we can tell this easily. First handle the case where the two lines are parallel or coincide; otherwise, get the cross point of the two lines on which the segments lie, and then check whether that cross point is on both segments. Really simple, right?

Max Flow Algorithm

Maximum flow is an important part of graph theory in competitive programming. There are already thousands of articles and blogs teaching the principles and the implementation of max flow, so I’m not going to elaborate on these things again. This post is just written to help me implement the max flow algorithm faster in a competition, with fewer mistakes.

Ford-Fulkerson

The main idea of the max flow algorithm is to find augmenting paths until none remains. When we find an augmenting path, we update the residual graph: we reduce the capacity of each edge on the path and increase the capacity of each reverse edge by the flow we pushed. So in the implementation we don’t need to store the current flow of each edge; storing the residual capacity is enough.


const int N=20005,M=2000005;
struct edge {
    int to,next,cap;
} e[M]; // note: sized by the edge count M, not the vertex count N
int head[N],tot;
void add_edge(int u,int v,int cap) {
    e[tot].to=v;
    e[tot].cap=cap;
    e[tot].next=head[u];
    head[u]=tot++;
}
bool vis[N];
int src,sink;

int dfs(int u,int sink,int f) {
    vis[u]=true;
    if (u==sink) return f;
    for (int i=head[u];i!=-1;i=e[i].next) {
        int to=e[i].to;
        if (!vis[to]&&e[i].cap>0) {
            int ff=dfs(to,sink,min(f,e[i].cap));
            if (ff>0) {
                e[i].cap-=ff;
                e[i^1].cap+=ff;
                return ff;
            }
        }
    }
    return 0;
}

int main()
{
    // Get input (set src and sink here)
    memset(head,-1,sizeof(head));
    // Build graph here
    int max_flow=0;
    while (true) {
        memset(vis,0,sizeof(vis));
        int ff=dfs(src,sink,INF);
        if (ff>0) max_flow+=ff;
        else break;
    }
    printf("max_flow:%d\n",max_flow);
}

Ford-Fulkerson is fast enough in most cases. But sometimes it cannot meet our requirements. That’s why we need Dinic.

Dinic

Two optimizations are used in Dinic which make it faster: one is the level array and the other is the current-arc optimization. I’m not going to elaborate on how they work here. Different from Ford-Fulkerson, it runs a BFS to build the level graph and then runs the DFS a couple of times (without re-initializing the arc pointers).


const int N=20005,M=2000005;
struct edge {
    int to,next,cap;
} e[M]; // sized by the edge count M
int head[N],tot;
void add_edge(int u,int v,int cap) {
    e[tot].to=v;
    e[tot].cap=cap;
    e[tot].next=head[u];
    head[u]=tot++;
}
int level[N],iter[N];
int n,src,sink;

void bfs(int src) {
    queue<int> q;
    memset(level,-1,sizeof(level));
    level[src]=0;
    q.push(src);
    while (!q.empty()) {
        int cur=q.front();
        q.pop();
        for (int i=head[cur];i!=-1;i=e[i].next) {
            int to=e[i].to;
            if (e[i].cap>0&&level[to]==-1) {
                level[to]=level[cur]+1;
                q.push(to);
            }
        }
    }
}

int dfs(int u,int sink,int f) {
    if (u==sink) return f;
    for (int& i=iter[u];i!=-1;i=e[i].next) { // current-arc optimization
        int to=e[i].to;
        if (e[i].cap>0&&level[u]<level[to]) {
            int ff=dfs(to,sink,min(f,e[i].cap));
            if (ff>0) {
                e[i].cap-=ff;
                e[i^1].cap+=ff;
                return ff;
            }
        }
    }
    return 0;
}

int main()
{
    // Get input (set n, src and sink here)
    memset(head,-1,sizeof(head));
    // Build graph here
    int mx_flow=0;
    while (true) {
        bfs(src);
        if (level[sink]==-1) break;
        for (int i=1;i<=n;i++) iter[i]=head[i];
        int ff;
        while ((ff=dfs(src,sink,INF))>0) { // assign first, then compare
            mx_flow+=ff;
        }
    }
    printf("%d\n",mx_flow);
}

Codeforces 984E Elevator

The problem: Codeforces 984E Elevator
I failed to solve the problem by myself, although I believed it should be solved by DP immediately after I read it. The difficult and interesting part of this problem is how to arrange the states of the whole process so that the total number of states is as small as possible.
Let’s think about it. For a state transition, we need to know:

  1. How many people have gotten into the elevator so far (some may have already gotten out).
  2. The current floor.
  3. Which floors the people currently in the elevator want to go to.

The only problem is the third one. There are at most 4 people in the elevator and each person may go to any of the 9 floors. So there would be
$$ 10^4=10000 $$
states in total in the 3rd dimension of the dp.
However, that is a little more than we can afford. Actually some of the states above are duplicates: person 1 going to floor 3 and person 2 going to floor 4 is the same as person 1 going to floor 4 and person 2 going to floor 3. So we only care about how many people go to each floor, with no need to know who. A state is then a multiset of at most 4 destination floors chosen from 9, so the total number of states is:
$$ \sum_{k=0}^{4} \binom{9+k-1}{k} = 1 + 9 + 45 + 165 + 495 = 715 $$

This enables us to get accepted.
Some technique should be adopted to represent the state: instead of using integers, we use a sorted 4-character string. With the help of map, we can implement it nicely:


#include <stdio.h>
#include <iostream>
#include <algorithm>
#include <string>
#include <string.h>
#include <math.h>
#include <map>

using namespace std;

#define int_ int64_t
#define pb push_back
#define mp make_pair
#define ll long long
#define ull unsigned ll
#define db double
#define INF 0x3f3f3f3f
#define MOD 1000003
#define PII pair<int, int>

const int N=2010;
int n,a[N],b[N];
map<string,int> dp[N][10]; // dp[next person][current floor][sorted destinations]

int dfs(int idx,int cur,const string& s) {
    if (idx>n&&s=="0000") { return 0; }
    int& res=dp[idx][cur][s];
    if (res) return res;
    res=INF;
    for (char c='1';c<='9';c++) { // move the elevator to floor c next
        string t=s;
        int x=0;
        for (int i=0;i<4;i++) {
            if (t[i]==c) t[i]='0',x++; // drop off passengers bound for c
        }
        int now=idx;
        for (int i=0;i<4;i++) {
            if (t[i]=='0'&&now<=n&&a[now]+'0'==c) {
                t[i]=b[now++]+'0'; // pick up the next waiting person
                x++;
            }
        }
        sort(t.begin(),t.end());
        if (idx==now&&t==s) continue; // nothing happened at floor c
        res=min(res,abs(cur-(c-'0'))+x+dfs(now,c-'0',t));
    }
    return res;
}

int main()
{
    scanf("%d",&n);
    for (int i=1;i<=n;i++) {
        scanf("%d%d",a+i,b+i);
    }
    printf("%d\n",dfs(1,1,"0000"));
}