
This is an explanation of this problem from USACO's training website, which I have converted to Markdown. Please do not just copy the code, because you will not learn anything that way; at least type it out and make sure you understand it so you can solve it yourself in the future!

The cows have developed a new interest in scanning the universe outside their farm with radiotelescopes. Recently, they noticed a very curious microwave pulsing emission sent right from the centre of the galaxy. They wish to know if the emission is transmitted by some extraterrestrial form of intelligent life or if it is nothing but the usual heartbeat of the stars.

Help the cows to find the Truth by providing a tool to analyze bit patterns in the files they record. They are seeking bit patterns of length A through B inclusive (1 <= A <= B <= 12) that repeat themselves most often in each day’s data file. An input limit tells how many of the most frequent patterns to output.

Pattern occurrences may overlap, and only patterns that occur at least once are taken into account.
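To make the overlap rule concrete, here is a minimal brute-force sketch. The helper name `count_overlapping` is my own and is not part of the problem or the official solution; it only illustrates what counts as an occurrence, and it is far too slow to use for every pattern on a 200,000-character input.

```c
#include <assert.h>
#include <string.h>

/* Count occurrences of pat in text, letting matches overlap:
   after a match at position i, the search resumes at i+1. */
int
count_overlapping(const char *text, const char *pat)
{
    size_t n = strlen(text), m = strlen(pat), i;
    int count = 0;

    assert(m > 0);
    for (i = 0; i + m <= n; i++)
        if (strncmp(text + i, pat, m) == 0)
            count++;
    return count;
}
```

For example, `count_overlapping("0000", "00")` is 3, because the pattern matches at positions 0, 1, and 2.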

PROGRAM NAME: contact

INPUT FORMAT

Line 1: Three space-separated integers: A, B, N (1 <= N <= 50).
Line 2..end: A sequence of as many as 200,000 characters, all 0 or 1; the characters are presented 80 per line, except potentially the last line.

SAMPLE INPUT (file contact.in)

2 4 10
01010010010001000111101100001010011001111000010010011110010000000

In this example, pattern 100 occurs 12 times, and pattern 1000 occurs 5 times. The most frequent pattern is 00, with 23 occurrences.

OUTPUT FORMAT

Lines that list the N highest frequencies (in descending order of frequency) along with the patterns that occur in those frequencies. Order those patterns by shortest-to-longest and increasing binary number for those of the same frequency. If fewer than N highest frequencies are available, print only those that are.

Print the frequency alone by itself on a line. Then print the actual patterns space separated, six to a line (unless fewer than six remain).

SAMPLE OUTPUT (file contact.out)

23
00
15
01 10
12
100
11
11 000 001
10
010
8
0100
7
0010 1001
6
111 0000
5
011 110 1000
4
0001 0011 1100

CODE

Java


C++


Pascal



ANALYSIS

Russ Cox

For this problem, we keep track of every bit sequence we see. We could use the bit sequence itself as an index into a table of frequencies, but that would not distinguish between the 2-bit sequence “10” and the 4-bit sequence “0010”. To solve this, we always add a 1 to the beginning of the number, so “10” becomes “110” and “0010” becomes “10010”.
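The leading-1 trick can be sketched in isolation as follows. The names `encode_pattern` and `decode_pattern` are hypothetical helpers I introduce for illustration; the solution further down does the same encoding inside `addseq` and decodes recursively in `printbits`.

```c
#include <assert.h>

#define MAXBITS 12

/* Tag the low n bits of `bits` with a leading 1 so that patterns of
   different lengths map to different table indices. */
unsigned
encode_pattern(unsigned bits, int n)
{
    assert(n >= 0 && n <= MAXBITS);
    bits &= (1u << n) - 1;      /* keep only the n most recent bits */
    return bits | (1u << n);    /* sentinel 1 just above them */
}

/* Recover the pattern text from a key; returns the pattern length.
   `out` must have room for MAXBITS+1 characters. */
int
decode_pattern(unsigned key, char *out)
{
    int n = 0, i;
    unsigned t;

    assert(key >= 1);
    for (t = key; t > 1; t >>= 1)   /* position of the sentinel = length */
        n++;
    for (i = 0; i < n; i++)
        out[i] = ((key >> (n - 1 - i)) & 1) ? '1' : '0';
    out[n] = '\0';
    return n;
}
```

With this encoding, "10" maps to 6 (binary 110) and "0010" maps to 18 (binary 10010). A shorter pattern always gets a smaller key than a longer one, so sorting by key also gives the shortest-to-longest, then increasing-binary order that the output format asks for.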

After reading the entire bit string, we sort the frequency table and walk through it to print out the top sequences.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

#define MAXBITS 12
#define MAXSEQ (1<<(MAXBITS+1))

typedef struct Seq Seq;
struct Seq {
    unsigned bits;
    int count;
};

Seq seq[MAXSEQ];

/* increment the count for the n-bit sequence "bits" */
void
addseq(unsigned bits, int n)
{
    bits &= (1<<n)-1;
    bits |= 1<<n;
    assert(seq[bits].bits == bits);
    seq[bits].count++;
}

/* print the bit sequence, decoding the 1<<n stuff */
/* recurse to print the bits most significant bit first */
void
printbits(FILE *fout, unsigned bits)
{
    assert(bits >= 1);
    if(bits == 1)	/* zero-bit sequence */
	return;

    printbits(fout, bits>>1);
    fprintf(fout, "%u", bits&1);
}

int
seqcmp(const void *va, const void *vb)
{
    Seq *a, *b;

    a = (Seq*)va;
    b = (Seq*)vb;

    /* big counts first */
    if(a->count < b->count)
	return 1;
    if(a->count > b->count)
	return -1;

    /* same count: small numbers first */
    if(a->bits < b->bits)
	return -1;
    if(a->bits > b->bits)
	return 1;

    return 0;
}

int
main(void)
{
    FILE *fin, *fout;
    int i, a, b, n, nbit, c, j, k;
    unsigned bit;
    char *sep;

    fin = fopen("contact.in", "r");
    fout = fopen("contact.out", "w");
    assert(fin != NULL && fout != NULL);

    nbit = 0;
    bit = 0;

    for(i=0; i<=MAXBITS; i++)
	for(j=0; j<(1<<i); j++)
	    seq[(1<<i) | j].bits = (1<<i) | j;

    fscanf(fin, "%d %d %d", &a, &b, &n);

    while((c = getc(fin)) != EOF) {
	if(c != '0' && c != '1')
	    continue;

	bit <<= 1;
	if(c == '1')
	    bit |= 1;

	if(nbit < b)
	    nbit++;

	for(i=a; i<=nbit; i++)
	    addseq(bit, i);
    }

    qsort(seq, MAXSEQ, sizeof(Seq), seqcmp);

    /* print top n frequencies for number of bits between a and b */
    j = 0;
    for(i=0; i<n && j < MAXSEQ; i++) {
	if(seq[j].count == 0)
	    break;

	c = seq[j].count;
	fprintf(fout, "%d\n", c);

	/* print all entries with frequency c */
	sep = "";
	for(k=0; j < MAXSEQ && seq[j].count == c; j++, k++) {
	    fputs(sep, fout);
	    printbits(fout, seq[j].bits);
	    if(k%6 == 5)
		sep = "\n";
	    else
		sep = " ";
	}
	fprintf(fout, "\n");
    }

    exit(0);
}
