CS 074 The Digital World

Fall 2007
Computer Science Department
The College of Arts and Sciences
Boston College


Problem Set 3: Editing Binary Files; Wave Files


Assigned: Monday September 24, 2007
Due: Monday October 1, 2007
Points: 10

Overview

In this problem set you will use the BinEd program to make meaningful modifications to text and audio files.  In problem set 4, you will experiment with image files.  You should review the in-class lab exercise, particularly the effects of formulas like [N + 2], [N / 2], [N] / 2, [2 * N], [N % 50], etc.

A sample text file and two sample wave files (a five-second flute solo and a one-second recorded voice) are provided with this assignment.

When you transform or create a text file, you should save it using the "Save as Binary File" option on the Load/Save page of BinEd.

When you transform a wave file, you should first remove the 44-byte header (use the formula [N + 44]). This will not appreciably affect what you hear when you play the sound in BinEd, but the header could get in the way when you transform the file. After you have applied whatever formulas the problem requires, and are satisfied with the sound in BinEd, you should save it using the "Save as Sound File" option. This will attach an appropriate header to the file, so the sound clip can be played by any software that is capable of handling .wav files.
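
If it helps to see what a formula like [N + 44] does outside of BinEd, here is a minimal Python sketch. It assumes that BinEd computes every new cell from a snapshot of the old file, and that the result is as long as the range the formula is applied to. The name apply_formula and the fake file contents are made up for illustration; they are not part of BinEd.

```python
def apply_formula(cells, source_line, count):
    """Return `count` new cells; new cell n is a copy of old cell source_line(n)."""
    old = list(cells)  # assumption: formulas read a snapshot of the old file
    return [old[source_line(n)] for n in range(count)]


# Stand-in for a tiny .wav file: 44 header bytes followed by 4 sample bytes.
wav = [0] * 44 + [128, 130, 127, 125]

# [N + 44]: line N takes the value that used to sit 44 lines further down.
samples = apply_formula(wav, lambda n: n + 44, len(wav) - 44)
print(samples)
```

Only the four sample values are left; the header bytes have been shifted out of the file.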

There are 12 problems in this problem set. Some of these are straightforward reworkings of things we did in the in-class exercise, some require you to go a little bit farther, and a couple are quite challenging. Each problem has been assigned a point value based on the problem's difficulty. We have provided hints for the more challenging ones. In each of these problems, the goal is to produce the desired modification using a formula or sequence of formulas in BinEd. You are not expected to do all, or even most, of the problems. A "perfect" score on the assignment is 25 points. The problem set will be scaled to 10 points by multiplying the total points by 0.4. Points in excess of 25 will be noted, and may be put in the bank to pay back a future deficiency in a problem set score. Note that there are several very challenging problems at the end!

Problems

  1. (5 points) Change all the letters in the text file to lower case. (Compare this to the very similar problem in the in-class exercise).

    Answer: Apply the formula [n]+32 to all cells with values in the range 65 to 90.  (This is the range of ASCII codes of the upper-case letters.)
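
The same value-restricted transformation can be sketched in Python. This is only an illustration of the idea, not how BinEd works internally, and the function name to_lower is made up.

```python
def to_lower(data: bytes) -> bytes:
    """Add 32 to every byte in 65..90 (A-Z); leave all other bytes alone."""
    return bytes(b + 32 if 65 <= b <= 90 else b for b in data)


print(to_lower(b"The Raven, 1845!"))
```

Digits, spaces, and punctuation fall outside the range 65 to 90, so they pass through unchanged.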

  2. (5 points) Change all the a's, both lower- and upper-case, to the underscore character _.  (You'll need to look up or otherwise figure out the ASCII encodings of all three of these characters in order to do the problem.)

    Answer: The ASCII code for the underscore character is 5F(hex), or 95 in decimal.  So apply the formula 95 to all cells with value 65, and then to all cells with value 97.

  3. (8 points) Change all the characters except the a's, both lower- and upper-case, to 'X'. The resulting file should consist mostly of X's, punctuated by an occasional 'a' or 'A'. (Changing everything to 'X' is easy; sparing the a's takes more care. You may have to apply the formula to several separate ranges of values in succession and look up the appropriate ASCII codes.)

    Answer: The ASCII code for 'X' is 88.  Apply the formula 88 to all cells with value in the range 0 to 64, then 66 to 96, then 98 to 255.  (The gaps at 65 and 97 are what spare 'A' and 'a'.)

  4. (10 points) Change all the lower-case letters to upper-case, and vice-versa.  (So, "Edgar Allen Poe" will come out as "eDGAR aLLEN pOE".)  Observe that it doesn't work to first change all the lower-case letters to upper-case, and then change all the upper-case letters to lower-case, since the end result will be to have changed all of the letters in the file to lower-case.  You must first prepare the ground by moving the upper-case letters out of the way.  To do this, note that no character has an ASCII code greater than 127, although all the values between 128 and 255 are legitimate values for a byte. Think about this one.

    Answer: Apply the formula [n]+63 to all values in the range 65 to 90; this parks the upper-case letters at 128 to 153.  Then apply [n]-32 to all values in the range 97 to 122, then finally [n]-31 to all values in the range 128 to 153.
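
Here is a short Python sketch of the three passes, using 128 to 153 as the parking range for the upper-case letters. The helper names are made up for illustration, and the sketch assumes the text is plain ASCII.

```python
def apply_to_values(cells, lo, hi, delta):
    """Add delta to every cell whose value lies in lo..hi; leave the rest alone."""
    return [b + delta if lo <= b <= hi else b for b in cells]


def swap_case(text: str) -> str:
    cells = list(text.encode("ascii"))
    cells = apply_to_values(cells, 65, 90, 63)     # A-Z moved up to 128..153
    cells = apply_to_values(cells, 97, 122, -32)   # a-z become A-Z
    cells = apply_to_values(cells, 128, 153, -31)  # parked letters become a-z
    return bytes(cells).decode("ascii")


print(swap_case("Edgar Allen Poe"))
```

Without the first pass, the second and third passes would undo each other and every letter would end up in the same case.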

  5. (18 points total) We already saw the formulas for (a) and (b) in the in-class exercise.

    1. (5 points) Make the sound play twice as fast. (This will raise the pitch an octave, and cut the length of the file in half.)

      Answer: Apply [2*n] to every cell. Listen.

    2. (5 points) Make the sound play half as fast. (This will lower the pitch an octave, and double the length of the file.)

      Answer: Apply [n/2] to all cells in the range 0 to 1 less than twice the file length. For instance, with a file of length 10000 you would apply this formula to all the cells with line numbers in the range 0 to 19999. Listen.

    3. (8 points) If you applied the transformation (a) followed by (b), it should restore the sound to its original state; likewise if you do (b) first and then (a). But is that really true?  Does it matter which order you do these in? (See the instructions below about how to hand in your answer to this question).

      Answer: Applying (b), then (a) really does restore the original samples; but applying (a) then (b) throws out the odd-numbered samples and duplicates the even-numbered ones.  The effect is to halve the sampling rate, which results in perceptibly lower sound quality. Listen.
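
You can check the claim about ordering on a tiny list of samples. In this Python sketch the function names are made up, and it assumes that [2*n] keeps only the cells whose source line exists, so the file shrinks to half its length.

```python
def speed_up(cells):
    """[2*n]: keep every other sample, halving the length."""
    return [cells[2 * n] for n in range(len(cells) // 2)]


def slow_down(cells):
    """[n/2] over lines 0 .. 2*length-1: play every sample twice."""
    return [cells[n // 2] for n in range(2 * len(cells))]


original = [10, 11, 12, 13, 14, 15, 16, 17]
b_then_a = speed_up(slow_down(original))
a_then_b = slow_down(speed_up(original))
print(b_then_a)
print(a_then_b)
```

Doing (b) first gives back the original list, while doing (a) first loses the odd-numbered samples for good.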

  6. (8 points) Make the sound play backwards.  (Think about this--if the file were 10 bytes long, then the value in line 0 should be replaced by the value in line 9, the value in line 1 by the value in line 8, the value in line 2 by the value in line 7, and in general the value in line x by the value in line 9-x.  Use this observation to find the correct formula.)

    Answer: Apply the formula [r-n], where r is one less than the file length. For example, if the length is 10000, apply [9999-n] to all cells. Listen.
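
The 10-byte example from the problem statement can be played out in a few lines of Python. The function name is made up, and the sketch assumes each new cell is read from the old, unmodified file.

```python
def reverse(cells):
    """[r-n], where r is one less than the file length."""
    r = len(cells) - 1
    return [cells[r - n] for n in range(len(cells))]


ten_bytes = list(range(10))
print(reverse(ten_bytes))
```

If the formula read from a file that was already partly overwritten, the second half would just mirror the first half, which is why the snapshot assumption matters here.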

  7. (8 points) Interchange the first and second halves of the musical passage. (Try first appending a copy of the first half to the end of the passage, then removing the first half.)

    Answer: Let's say for sake of definiteness that the file is 10,000 bytes long.  First apply [n-10000] to all cells with line numbers in the range 10000 to 14999, then [n+5000] to all cells. Listen.
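
The two steps can be traced on a short list in Python. This is a sketch with a made-up function name, and it assumes the file has an even number of samples.

```python
def swap_halves(cells):
    length = len(cells)
    half = length // 2
    # Step 1: [n - length] on lines length .. length+half-1
    # appends a copy of the first half to the end of the file.
    extended = cells + [cells[n - length] for n in range(length, length + half)]
    # Step 2: [n + half] on all cells shifts everything down,
    # which drops the original first half.
    return [extended[n + half] for n in range(length)]


print(swap_halves(list(range(10))))
```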

  8. (10 points) Make the sound gradually softer---that is, do a slow fade from the beginning to the end. (Multiply the value in line N by a factor that is equal to 1 when N=0, and decreases steadily to become 0 for the last line of the file.)

    Answer: Let's say the length is 10000.  The idea is to multiply [n] by (10000-n)/10000, but you have to be careful about integer division.  Use ((10000-n)*[n])/10000. Listen.
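
To see why the order of operations matters, compare the two ways of writing the fade in Python, where // is integer division. The function names are made up, and the sketch assumes BinEd divides integers the same way.

```python
def fade_divide_first(cells):
    """[n] * ((length-n)/length): the factor is rounded down before multiplying."""
    length = len(cells)
    return [cells[n] * ((length - n) // length) for n in range(length)]


def fade_multiply_first(cells):
    """((length-n) * [n]) / length: multiply first, divide last."""
    length = len(cells)
    return [((length - n) * cells[n]) // length for n in range(length)]


steady = [200] * 10
print(fade_divide_first(steady))
print(fade_multiply_first(steady))
```

Dividing first silences everything after line 0, because (length-n)/length rounds down to 0 as soon as n is at least 1. Multiplying first gives the steady ramp we want.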

The next three exercises ask you to create a synthetic sound, rather than modify a prerecorded sound. This is what we did with the Bart Simpson blackboard phrase in the in-class exercise. Another very simple example is to apply the formula n % 256 to the lines with numbers in the range 0 to 11024. This produces a sequence of values that repeats every 256 samples--that works out to about 43 cycles per second (11025/256).

  1. (8 points) Create a 400 hertz square wave 3 seconds in duration. This will spend 1/800 sec. at a single amplitude value, then 1/800 sec. at a different value, and then repeatedly oscillate between these two values. You can choose any two values between 0 and 255 -- I recommend 100 and 200.  (Since there are 11025 samples per second, 1/800 of a second is 11025/800, or approximately fourteen samples long.)

    Answer: 400 hertz is 400 cycles every 11025 samples, or roughly 28 samples in each cycle (28x400=11200).  So we can apply the formula 0 to cells 0 to 13, the formula 200 to cells 14 through 27, and finally the formula [n%28] to all cells up to 33074. Listen.
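
Here is the same construction as a Python sketch: build one 28-sample cycle, then repeat it with n % 28. The names are made up for illustration, and the default values 0 and 200 are the ones used in the answer; any two values between 0 and 255 would do.

```python
SAMPLE_RATE = 11025  # samples per second, as stated in the problem


def square_wave(low=0, high=200, half_period=14, seconds=3):
    period = 2 * half_period
    one_cycle = [low] * half_period + [high] * half_period  # cells 0 .. period-1
    # [n % period] copies the first cycle into every later cell.
    return [one_cycle[n % period] for n in range(SAMPLE_RATE * seconds)]


wave = square_wave()
print(len(wave), wave[12:16], SAMPLE_RATE / 28)
```

Because the cycle has to be a whole number of samples, the pitch comes out at 11025/28 = 393.75 hertz rather than exactly 400.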

  2. (15 points) Square waves have a kind of harsh tone that we typically associate with electronic equipment.  You get a more musical tone with a sine wave. Try to make a 400 hertz sine wave 3 seconds in duration.

    HINT: There is no sine function built in to the formula language of BinEd, but this problem can be done by looking up only seven values of the function.   Here is a graph of the sine function, showing three complete cycles and part of a fourth:

    Now you will need to scale this in two different ways:  The sine function itself takes on values between -1 and 1. If you want it to take on values between, say, 100 and 200, you should use 150+50*sin(x).  To scale it horizontally, you should note that a complete cycle will correspond to 1/400 sec, or about 28 samples, but you only need to find values of the sine function for the first quarter of a cycle (the rising portion of the graph all the way to the left). Use the values of sin(x) at x=90/7 degrees, 2*90/7 degrees, ..., 90 degrees, scaled appropriately. The remaining values can be filled in using the symmetry in the graph.

    Answer: The values of the sine function at 90/7 degrees, 2*90/7, ..., 90 degrees are 0.2225, 0.434, 0.623, 0.781, 0.901, 0.975, 1.  I use 100+100*sin(x) because it's easier to figure out, and put 122, 143, 162, 178, 190, 198, 200 in cells 0 through 6.  Then put [13-n] in cells 7 through 13 and 100 in cell 14.  Then 200-[28-n] in cells 15 through 28, 100 in cell 29, and [n%30] in the first 30,000 cells.  Listen and compare to the square wave.
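
The symmetry trick is easier to follow when you can see the whole cycle at once. This Python sketch builds the 30-cell cycle from the seven looked-up values exactly as the answer describes; the names QUARTER and one_cycle are made up for illustration.

```python
# 100 + 100*sin(x) at x = 90/7, 2*90/7, ..., 90 degrees, as given in the answer.
QUARTER = [122, 143, 162, 178, 190, 198, 200]


def one_cycle():
    cycle = [0] * 30
    cycle[0:7] = QUARTER                # rising quarter, cells 0..6
    for n in range(7, 14):
        cycle[n] = cycle[13 - n]        # [13-n]: mirror image, falling back down
    cycle[14] = 100                     # midpoint
    for n in range(15, 29):
        cycle[n] = 200 - cycle[28 - n]  # 200-[28-n]: upper half flipped below 100
    cycle[29] = 100                     # midpoint again, ready to repeat
    return cycle


cycle = one_cycle()
wave = [cycle[n % 30] for n in range(30000)]  # [n%30] in the first 30,000 cells
print(cycle)
```

With a 30-sample cycle the pitch comes out near 11025/30 = 367.5 hertz, a little under the 400 hertz target.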

  3. (15 points) Create sound  that gradually changes its pitch over the course of 3 seconds from about 100 hertz to about 1000 hertz.

    HINT: Here's an idea to get you started--the formula (n*n)%256 takes on values between 0 and 255 for samples 0 to 15, then values between 0 and 255 again for samples 16 to 22, then again for samples 23 to 27.  The overall effect is of roughly the same values repeated over and over again, but with progressively shorter cycles. The frequencies and duration for this particular formula are all wrong, and it probably won't sound like much of anything at all, but if you tweak the formula appropriately, you can get the right effect.

    Answer: This is the only problem where I resorted to more complicated math.  (A good lesson for those of you who have taken calculus and wondered what the heck it was good for.) I want something that repeatedly increases from 0 to 255, but at 0 seconds I want 100 repetitions per second, and at 3 seconds I want 1000 repetitions per second.  A wave that repeatedly goes from 0 to 255 at a constant frequency f rises 256 units in 11025/f samples, and thus has slope f*256/11025.  So a wave whose frequency changes as required will have slope 25600/11025, or about 2.5, when n=0, and about 25 when n=33075 (3 seconds).  So we will look for a function whose slope at n is:

    
         2.5+22.5*n/33075
        

    and take its remainder upon division by 256.  That's where the calculus comes in --- we have to find a function whose derivative is this.  Hmm, do I remember how to do that? I get:

    
         2.5 * n + 11.25 * n * n / 33075
        

    Since I'm not worried about things being so exact, I'll write the formula as:

    
         (3 * n + (11 * n * n) / 33075) % 256
        

    I'd love to know if someone had a simpler solution! Listen.
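
As a rough check on the final formula, this Python sketch estimates the pitch at the start and at the end from the slope, using the relation slope = f*256/11025 from above. The function names are made up, and // stands in for BinEd's integer division, which is an assumption on my part.

```python
SAMPLE_RATE = 11025


def phase(n):
    """The part of the formula inside the % 256."""
    return 3 * n + (11 * n * n) // 33075


def sample(n):
    return phase(n) % 256


def frequency_at(n):
    """Local slope (rise per sample) converted to cycles per second."""
    slope = phase(n + 1) - phase(n)
    return slope * SAMPLE_RATE / 256


wave = [sample(n) for n in range(3 * SAMPLE_RATE)]
print(frequency_at(0), frequency_at(33074))
```

Because of the rounding to 3 and 11, the sweep runs from roughly 130 hertz to a bit over 1000 hertz, which is close enough to "about 100" and "about 1000" for this problem.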

  4. (20 points) Lower the pitch of the sound by one octave without changing the speed.

    HINT: It is very difficult to do this well, but here is a method that gives adequate results:  Imagine the time scale of the audio clip divided up into segments about 10 to 20 milliseconds long---i.e. roughly 100 to 200 samples in length.  If we remove every other segment, we will hear something like the original passage twice as fast without a change in pitch---this will be true if the pitch of the original passage is relatively high, at least a few hundred hertz.  If we then slow the result down by a factor of two as in problem 5, we will restore the original speed but lower the pitch.

    Here's how the math works: If we use segments that are 100 samples long, then line n is at position n%100 in segment number n/100.  Conversely, the value at position j in segment i is at line number 100*i+j.  The strategy asks you to replace the value at position j in segment i by the value at position j in segment 2*i. You should experiment with constants other than 100 to get the best results.

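    I got the best results with the "fascinating clip" with 20 ms slices (200 samples). The procedure outlined above says to replace the value in line n, which sits at position n%200 in segment n/200, by the value at the same position in segment 2*(n/200), that is, by the value in line [400*(n/200)+(n%200)].  Apply this to the first half of the file.  This should give a speedup without a pitch change, although there is some steady low pitched noise at 55 hertz (one click per 200-sample segment).  Listen.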

    We now apply [n/2] to slow it down. Listen, and compare that to the pitch change we did earlier.
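
Both stages can be traced on a toy file in Python. This is a sketch under the same snapshot assumption as before, with made-up function names; it uses segments of 2 samples so the result is easy to read, where the real clip would use something like 100 or 200.

```python
def drop_every_other_segment(cells, seg):
    """New line n takes position n % seg from segment 2*(n // seg) of the old file."""
    half = len(cells) // 2
    return [cells[seg * (2 * (n // seg)) + n % seg] for n in range(half)]


def slow_down(cells):
    """[n/2] over lines 0 .. 2*length-1, as in problem 5(b)."""
    return [cells[n // 2] for n in range(2 * len(cells))]


clip = list(range(8))
fast = drop_every_other_segment(clip, seg=2)
low = slow_down(fast)
print(fast)
print(low)
```

The result is as long as the original clip, but each surviving segment has been stretched to twice its length, which is what lowers the pitch.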

    What to Hand In

    For each problem that you do, prepare the modified text or wave file, giving them names like Problem3.txt and Problem7.wav.  In addition, you should prepare a text document explaining how you did each problem.  Usually, the explanation will consist of a formula along with a restriction (applied only to line numbers in a certain range, or only to values in a certain range); in some cases you will have to make two or more such transformations to achieve the desired effect. Problem 5(c) asks a follow-up question, which you should also answer in the text document.

    Place the text document, along with all the files you create, in a folder named Lab 2-Your Last Name, create the zipped archive as you did in the previous labs, and submit through WebCT.