PROBLEM LINK:
Practice
Div-2 Contest
Div-1 Contest
Author: Aleksandr Abelyan
Tester: Radoslav Dimitrov
Editorialist: Jelmer Firet
DIFFICULTY:
MEDIUM
PREREQUISITES:
Bitwise operations
Setter’s solution: Fast Fourier Transform
PROBLEM:
You are given N vertical lines and M horizontal lines. You may add one more horizontal line. Give the maximum number of squares you can form with different areas?
QUICK EXPLANATION:
Use bitsets to speed up the following \mathcal{O}(n^2) solution. Determine which distances between pairs of horizontal and vertical lines already exist. For each possible new line (temporarily) update the distances between horizontal pairs. Then count how many k are both the distance between two horizontal lines and the distance between two vertical lines.
EXPLANATION:
Before I start my explanation, I want to mention that I start explaining the idea behind the algorithm. The code given in this part is used to support the idea, but would give a TLE because it is too slow. In the section after that I will discuss how the code can be sped up using bitsets.
Idea of the algorithm
Note that a square is a rectangle with equal side lengths. Furthermore the area of a square is entirely defined by the side length. So the problem could have been rephrased as finding the maximum possible number of squares with different side lengths. A square with side length k is formed by the lines only if there are a pair of vertical lines k units apart; and a pair of horizontal lines k units apart.
Because we are not allowed to add vertical lines, all possible square sizes are already determined by the given vertical lines. Therefore it will be handy to figure out for each k if there are two vertical lines that are k units apart. We will also determine for which k there are two horizontal lines k units apart. We can find these distances between pairs of lines with the following bit of code:
// vertical[i] is true if there is a line x=i, reason for boolean array comes later
vector<bool> vertical;
//verDiff[k] is going to be true if there is a pair of vertical lines k apart
vector<bool> verDiff(width+1);
for (int i = 0; i <= width; i++) if (vertical[i]) { // lines x=i
for (int j = i; j <= width; j++) if (vertical[j]) { // lines x=j
verDiff[j-i] = true;
}
}
// do the same for horizontal lines
Now we have two boolean arrays verDiff
and horDiff
that hold whether the vertical or horizontal lines would allow a square of size k to exist. If we want to count the number of different sized squares we loop trough both arrays and count how often verDiff[k] = horDiff[k] = true
. But we can still add a line to increase the number of squares.
A quick modification that allows us to calculate the answer is to consider every possible line y=c, temporarily add it and calculate the number of squares we end up with. However this would become way too slow, because now we have three nested loops (over x_1, x_2 and c). To get back to quadratic complexity we notice that a lot of things don’t need to be recalculated. For example verDiff
doesn’t depend on the newly added line. Furthermore at most M of the values in horDiff
change when you add the new line, so we can just determine which these are.
So the algorithm we have consists of the following steps
- Calculate
verDiff
for the vertical lines - Calculate
horDiff
for the original horizontal lines - Consider each horizontal line
y=c
- Determine which items in
horDiff
will additionally be set to true (namednewHorDiff
) - Determine the number of squares by looping trough
verDiff
,horDiff
andnewHorDiff
- Determine which items in
This is done in the following snippet:
// calculate verDiff and horDiff
int maxSquare = 0;
for (int c = 0; c <= height; c++) { // consider y=c
vector<bool> newHorDiff; // the distances from the new line to the others
for (int i = 0; i <= height; i++) if (horizontal[i]) { // lines y=i
newHorDiff[abs(i-c)] = true;
}
int numSquare = 0;
for (int i = 0; i <= height; i++) {
if (verDiff[i] and (horDiff[i] or newHorDiff[i]) ) {
numSquare++;
}
}
maxSquare = max(maxSquare, numSquare);
}
// print maxSquare-1 to ignore the 0-area square
Making it faster
Now that we have got the idea of the algorithm, there is still an issue: it’s too slow. In my explanatory snippets there are nested for loops, each of which can be executed up to 10^5 times. This means the algorithm is \mathcal{O}(n^2), where we take n the largest of W,H,N,M. Using the estimate of 10^9 operations per second, it seems that this algorithm is not feasible. However, we can do more in a single operation. Let’s introduce bitwise operations.
Bitwise operations are like the familiar boolean operations, but they work on 64 bits at a time (maybe 32 based on OS and hardware). There are also two additional bitwise operations: left-shift and right-shift. You can read up on these on Wikipedia if you are not familiar with them.
To make use of bitwise operations we will store the boolean arrays in a bitset
instead. bitset
is a datastructure from the C++ standard that allow us to use bitwise operations on more than 64 bits. Other programming languages might have similar structures predefined. Now we just need to transform the snippets to use bitsets.
For the first snippet we are going to replace the inner for loop with bitwise operations. First we can replace the assignment of true
by an or operation with the condition of if(vertical[x2])
, so we will get verDiff[x2-x1] |= vertical[x2]
. Next we will increase both indices to get verDiff[x2] |= vertical[x2+x1]
. And finally turn the for-loop into a bitwise operations using, where you replace the +x1
in the index with a left-shift. We then turned the snippet into the following code:
// vertical[i] is true if there is a line x=i, reason for boolean array comes later
bitset<mx> vertical;
//verDiff[k] is going to be true if there is a pair of vertical lines k apart
bitset<mx> verDiff;
for (int i = 0; i <= width; i++) if (vertical[i]) {
verDiff |= (vertical >> i)
}
// repeat the same for the horizontal lines
For the second segment I need to introduce a new bitset. This will be revHorizontal
, which is the reverse of horizontal
. So revHorizontal[i]
is set if there is a horizontal line y=\text{height}-i. We need this new bitset because it also allows us to update distances to lines below the new line quickly. horizontal
and revHorizontal
will be shifted by y and \text{height}-y respectively, so that after shifting horizontal[k]
is set if there is a line at y=c+k, and similarly revHorizontal[k]
will be set if there is a line at y=c-k.
Next we need to calculate for how many squares we can form. We can just translate the boolean operations of the snippet into bitwise operations to get a bitset where bit k is set if there is a square with side-length k. The C++ bitset
contains a function count
that counts how many bits are set, so the number of squares. The second snippet transforms into the following snippet:
// calculate verDiff and horDiff
int maxSquare = 0;
for (int c = 0; c <= height; c++) {
bitset<mx> newHorDiff;
newHorDiff |= (horizontal >> c);
newHorDiff |= (revHorizontal >> (height-c));
int numSquare = (verDiff & (horDiff | newHorDiff)).count();
maxSquare = max(maxSquare, numSquare);
}
// print maxSquare-1 to ignore the 0-area square.
Now that all inner for loops have been replaced by bitwise operations the algorithm is fast enough to pass the time limit. Intuitively by using bitsets we turned an algorithm that used \mathcal{O}(n^2) operations into an algorithm that uses \mathcal{O}(\frac{n^2}{64}) operations. (Though note that because of the definition of big O the algorithm using bitsets is still \mathcal{O}(n^2))
You can look at my full code below which has a lot more comments.
ALTERNATE SOLUTIONS:
The setter (Aleksandr) used Fast Fourier Transfrom (FFT) to determine the already existing distances between pairs of horizontal or vertical lines. This is a method that uses some complex (literally) mathematics to determine for multiply frequencies how strong an input array “resonates” with that frequency. In terms of this problem, that would happen if there is a pair of lines that frequency apart. 3blue1brown has a good video that explains what a Fourier Transform is. In practice I have not yet encountered a problem for which a FFT is required for the solution. If any of you do, please tell me about it in the comments.
The tester (Radoslav) used the same approach as me. However he constructed the revHorizontal
on the fly instead of beforehand.
SOLUTIONS:
Setter's Solution
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
typedef unsigned long long ull;
typedef long double ld;
//#pragma GCC target("avx2")
//#pragma GCC optimization("unroll-loops")
//#pragma GCC optimize("O2")
//#pragma GCC optimize("O3")
#define FOR(i,a) for (int i=0;i<(a);++i)
#define FORD(i,a) for (int i=(a)-1;i>=0;i--)
#define FORT(i,a,b) for (int i=(a);i<=(b);++i)
#define FORTD(i,b,a) for (int i=(b);i>=(a);--i)
#define trav(i,v) for (auto i : v)
#define all(v) v.begin(),v.end()
#define ad push_back
#define fr first
#define sc second
#define mpr(a,b) make_pair(a,b)
#define pir pair<int,int>
#define make_unique(v) v.erase(unique(all(v)),v.end())
#define fastio ios_base::sync_with_stdio(false); cin.tie(0); //cout.tie(0);
#define srng mt19937 rng(chrono::steady_clock::now().time_since_epoch().count())
#define y1 EsiHancagorcRepa
const int N=1e5+4;
const ll MOD=1e9+7;
const ll MX=1e16;
using cd = complex<ld>;
using vcd = vector<complex<ld> >;
const ld PI = acos((ld)-1);
int rev[3 * N];
cd roots[3 * N];
void fft(vcd& a)
{
int n = a.size();
for (int i = 0; i < n; ++i)
if (i < rev[i])
swap(a[i], a[rev[i]]);
for (int len = 1, step = n / 2; len < n; len <<= 1, step >>= 1)
for (int st = 0; st < n; st += len + len)
for (int i = 0, root = 0; i < len; ++i, root += step)
{
cd u = a[st + i], v = roots[root] * a[st + len + i];
a[st + i] = u + v;
a[st + i + len] = u - v;
}
}
void inv(vcd& a)
{
fft(a);
reverse(a.begin() + 1, a.end());
for (cd& x : a)
x /= a.size();
}
void conv(vcd& a, vcd& b)
{
int n = 1, h = 0;
while (n < a.size() + b.size())
n *= 2, ++h;
while (a.size() < n) a.push_back(0);
while (b.size() < n) b.push_back(0);
rev[0] = 0;
for (int i = 1, high = -1; i < n; ++i) {
high += !(i & (i - 1));
rev[i] = rev[i ^ (1 << high)] | (1 << (h - high - 1));
}
ld sector = 2. * PI / n;
for (int i = 0; i < n; ++i) {
roots[i] = cd(cos(sector * i), sin(sector * i));
}
fft(a); fft(b);
for (int i = 0; i < n; ++i)
a[i] *= b[i];
inv(a);
}
bitset<N> tv,tn,unused;
int x[N],y[N];
int main(){
srng;
fastio;
int w,h,n,m;
cin>>w>>h>>n>>m;
vcd xx(w+1,0),yy(h+1,0),xxr(w+1,0),yyr(h+1,0);
FOR(i,n){
cin>>x[i];
xx[x[i]]=1;
xxr[w-x[i]]=1;
}
conv(xx,xxr);
FORT(i,w+1,xx.size()-1){
if ((int)(xx[i].real()+0.5)){
unused[i-w]=1;
}
}
FOR(i,m){
cin>>y[i];
tv[y[i]]=1;
yy[y[i]]=1;
yyr[h-y[i]]=1;
}
conv(yy,yyr);
int ans=0;
FORT(i,h+1,yy.size()-1){
if ((int)(yy[i].real()+0.5)){
if (unused[i-h])ans++;
unused[i-h]=0;
}
}
int pat=0;
FOR(i,h+1){
if (!tv[0]){
pat=max(pat,int((unused&(tv|tn)).count()));
tv=tv>>1;
tn=tn<<1;
}
else{
tv=tv>>1;
tn=tn<<1;
tn[1]=1;
}
}
cout<<ans+pat<<endl;
return 0;
}
Tester's Solution
#include <bits/stdc++.h>
using namespace std;
template<class T, class T1> int chkmax(T &x, const T1 &y) { return x < y ? x = y, 1 : 0; }
const int MAXN = (int)1e5 + 42;
int w, h, n, m;
int a[MAXN], b[MAXN];
bitset<MAXN> hor, ver, emp;
void read() {
cin >> w >> h >> n >> m;
for(int i = 0; i < n; i++) {
cin >> a[i];
hor[a[i]] = 1;
}
for(int i = 0; i < m; i++) {
cin >> b[i];
ver[b[i]] = 1;
}
}
void solve() {
bitset<MAXN> dp1 = emp, dp2 = emp;
// Can be replaced with FFT.
for(int i = 0; i < n; i++) {
dp1 |= hor >> a[i];
}
// Can be replaced with FFT.
for(int i = 0; i < m; i++) {
dp2 |= ver >> b[i];
}
bitset<MAXN> available = dp1 & dp2;
bitset<MAXN> pref, suff = ver;
pref[0] = ver[0];
int ans = available.count();
for(int i = 0; i <= h; i++) {
if(!ver[i]) {
chkmax(ans, (available | ((pref | suff) & dp1)).count());
}
pref <<= 1;
suff >>= 1;
pref[0] = ver[i + 1];
}
// -1 cause len=0 isn't a square
cout << ans - 1 << endl;
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(nullptr);
read();
solve();
return 0;
}
Editorialist's Solution
#include <iostream>
#include <vector>
#include <utility>
#include <tuple>
#include <algorithm>
#include <bitset>
using namespace std;
typedef vector<int> vi;
typedef vector<vi> vvi;
const int mx = 100001;
int main() {
int width, height, numVer, numHor;
cin >> width >> height >> numVer >> numHor;
// bitsets for vertical and horizontal lines
// vertical[i] is set if there is a line y=i
// horizontal[i] is set if there is a line x=i
// revHorizontal[i] is set if there is a line x=height-i
bitset<mx> vertical, horizontal, revHorizontal;
// read in x-coordinates of vertical lines and store in bitset
for (int i = 0; i < numVer; i++) {
int verLine;
cin >> verLine;
vertical.set(verLine);
}
// read in y-coordinates of horizontal lines and store in bitset
for (int i = 0; i < numHor; i++) {
int horLine;
cin >> horLine;
horizontal.set(horLine);
revHorizontal.set(height-horLine);
}
bitset<mx> verDiff, horDiff;
// iterate over each of the vertical lines (x=i)
for (int i = 0; i<= width; i++) if (vertical[i]) {
// and for each line (at x=j) to the right of it set bit j-i of verDiff
verDiff |= (vertical >> i);
}
// iterate over each of the horizontal lines (y=i)
for (int i = 0; i<= width; i++) if (horizontal[i]) {
// and for each line (at y=j) above it set bit j-i of horDiff
horDiff |= (horizontal >> i);
}
int maxSquare = 0;
// loop over all possible y-coordinates of the new line
for (int c = 0; c <= height; c++) {
// newHorDiff will store the newly formed distances,
bitset<mx> newHorDiff;
// update the distances between the new line and those above it
newHorDiff |= (horizontal >> c);
// update the distances between the new line and those below it
newHorDiff |= (revHorizontal >> (height-c));
// calculate how many squares can be created
// there is a square of a particular side-length if that side-length
// is the distance between two horizontal lines,
// and the distance between two vertical lines
int numSquare = (verDiff & (horDiff | newHorDiff)).count();
maxSquare = max(maxSquare, numSquare);
}
// -1 to ignore a 0-area square.
cout << maxSquare-1 << endl;
return 0;
}