You are not logged in. Please login at www.codechef.com to post your questions!

×

GERALD09 - Editorial

PROBLEM LINK:

Practice
Contest

Author: Gerald Agapov
Tester: Shang Jingbo and Gerald Agapov
Editorialist: Devendra Agarwal

DIFFICULTY:

Challenge

PREREQUISITES:

Greedy.

PROBLEM:

Given an empty matrix of NxM , fill the matrix with G,C,A and T.

The matrix will be stable if it contains K different submatrices. Your aim is to fill the matrix with A,C,G and T such that the difference between the number of different submatrices in matrix and number K should be as small as possible.

Explanation:

This problem is a challenge problem. For challenge problem , you should aim at finding some approximate/partial solution using algorithms like greedy , dp , etc..

For this problem , let us break the problem in cases :

Case 1 : For small n and m where n * m<=10

This case can be solved using brute force. Try all possible matrices and then find the matrix which gives the best possible estimate. All possible matrix are 4nm , and each matrix can be checked for different submatrices in O(n2m2) time . You need to be little careful in this method because test cases may go to 100 , so use this approach only if 4nmn2*m2 <= (4 * 108) / Total_Tests

Case 2 : A more generic case with any N,M and K.

First start with all 'A' in the matrix , then try to change each element of the matrix randomly and check if it gave a better estimate or not. But checking after changing each element may lead to TLE because time complexity for that will be T n3m3 where T is the Total_Tests.
So, instead of checking after changing at each box , we can use any strategy to reduce number of checks. One simple idea is to check only log(NM) times. Start with changing first (N * M)/2 elements and then keep on decreasing the number by 1/2 . So time required is T * n2m2log(nm) <= 100 * 154 * 10 < 108

We will use greedy approach in our selection , we will pick up only those changes which will take us close to K.

Now you can still make some random iterations in which you pick any two elements of the matrix and swap them and then check for number of different sub-matrices. If we get close to K , we accept those changes otherwise we reject it.
You can try some more random strategies here : by fixing an element of the matrix and choosing the other element for swapping.

Be careful on the number of random iterations, so,let Iterations be the number of Iterations you will run your program, then Iterations * n2*m2 <= 100000000/Total_Tests must be satisfied to avoid TLE . Actually you can try by changing the constant and submitting the solution to get a better bound on number of Iterations.

I would like to thank Ju Ruo , i got this insight after reading his code :) .
I would invite all the participants of this problem to share their awesome ideas with us :).

AUTHOR'S and TESTER'S SOLUTIONS:

Author's solution to be uploaded soon.
Tester's solution

This question is marked "community wiki".

asked 14 Jul '14, 15:01

devuy11's gravatar image

4★devuy11 ♦♦
46273842
accept rate: 0%

edited 14 Jul '14, 15:24


I'd like to highlight another approach, which was apparently enough for 0.921 points (at least on 20% of test data), and is great for large K.

You may have noticed that there just 225 possible matrix sizes (actually, just 120 when N <= M), but an incredible 50 kB of allowed source size. Why not actually use this space? Of course, not by writing 50 kB of code, but by pre-computing something for all possible N_x_M and hardwiring it into the code?

That something is (let's just fix N,M) a map of matrix:score (score = number of different submatrices), with scores as evenly distributed over the range [1, maximum of scores found by 1000 times running "random generate genome, check score"] as possible. Of course, one does not simply fit many matrices into 50 kB, so we need to compress the data somehow. My idea was to just remember the first matrix (random, maximum score), and then keep on replacing all non-A nucleotides in row-major order and only remember the scores we'd obtain after each replacement. This implicit description is easy to replicate, except with much less time necessary because we don't need to actually calculste the scores.

I ran a bruteforce solution that computed this initial genome and an array of scores (printing it already in copy-pastable array form) on my computer with O(N^2M^2 log N^2M^2) time per check, it runs in slightly less than a minute. Then, I just wrote a solution that stoarts with a shitload of numbers and then has a short main(): read input, simulate changing non-As to A from the initial genome for this N,M, pick the option which gives score closest to K, write output.

Effortless solution is effortless. To anyone that took more than several hours working on this problem to get a score of at least 0.9, I suggest playing Battle for Wesnoth intensively - you'll find out how much can be optimized outside of algorithms :D

I also have a question: Were the setter/tester aware of this solution?

link

answered 14 Jul '14, 16:09

xellos0's gravatar image

7★xellos0
5.7k53876
accept rate: 10%

4

My initial idea was to precompute as many matrices as possible (using various strategies) during the contest and then "compress" them somehow into the 50 KB of allowed source code. That didn't work, of course (the "compression" into 50 KB, I mean), so I spent a lot of submissions identifying a good part of the matrices from the test cases. Thus, I could save in my source code only my answers for those matrices (I needed only 506 different matrices, which fit into 50 KB without trying too use any special compression techniques).

(14 Jul '14, 17:09) mugurelionut7★

Here is my editorial for that problem with a lot of details, it gave me 19th place overall:

http://chasethered.com/?p=221

link

answered 18 Jul '14, 03:14

pkacprzak's gravatar image

6★pkacprzak ♦♦
70483786
accept rate: 12%

Where to know the score for my solution to this problem? And How to check July challenge rating of me(global/country) easily(not your boring next button clicking system).

link

answered 14 Jul '14, 16:26

muttakyn's gravatar image

3★muttakyn
15112
accept rate: 0%

3

"not your boring next button clicking system" Nobody asked you to click next buttons and no one is obliged to provide your rating. CodeChef is amazing in what they do, if you don't like it, then this site is not for you.

(14 Jul '14, 16:55) ambarpal5★

you can go to "My submissions" to check your score to this problem. Click on your profile to see your rating/ranking which is overall. Its unnecessary to check your rank in July challenge unless you solved all the questions =)

(15 Jul '14, 17:55) lallu12★

Sir what would be the best way to find the number of sub matrices? I used PnC, but things get complicated as the types of letters increase.

link

answered 14 Jul '14, 22:03

parash93's gravatar image

2★parash93
11
accept rate: 0%

1

What's PnC?

(14 Jul '14, 22:25) xellos07★
  • using Permutations
(14 Jul '14, 23:43) parash932★

I have no idea at all how finding the number of distinct submatrices could ever involve permutations. I just tried hashing all submatrices (there are just O(N^2M^2) of them) and storing their hashes in an STL set<>.

(14 Jul '14, 23:48) xellos07★

size of submatrix vary from [1..n]x[1..m] makes the number of sub-matrices n^2.m^2!!

(15 Jul '14, 17:57) lallu12★

@xellos0 I used hashing too. I just used unordered_set instead.

(18 Jul '14, 03:50) piotrekg24★

The author's solution is not uploaded even now (after 3 years)!

link

answered 07 Jul, 07:38

ravibitsgoa's gravatar image

3★ravibitsgoa
11
accept rate: 0%

toggle preview
Preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported
  • mathemetical formulas in Latex between $ symbol

Question tags:

×11,354
×696
×581
×21
×4

question asked: 14 Jul '14, 15:01

question was seen: 4,002 times

last updated: 07 Jul, 07:38