CS074: The Digital World

Lab 1 Solutions




Exercise 1 - File Sizes

Review the material in the first lecture on powers of two and the exact meaning of kilobyte, megabyte, etc.

(a) Examine the sizes of several files: large (larger than 1MB in size), medium (10 or 20 KB) and small (less than 2KB). You can use the TextEditor to create the small files, as described above. Note down the results you see; you should get results from at least three different files. Pay close attention to the numbers you see for "size on disk". You should also indicate what each file is (e.g., an audio file in your iTunes library, a Microsoft Word document, the program file that contains Microsoft Word itself, etc.)

Answer:  I did this on a Mac. The brief text file described in (c) below has 43 bytes; the same brief text saved as a Microsoft Word .doc file has 22,528 bytes, and the file containing the application code for Microsoft Word itself contains 62,586,055 bytes.  The "size on disk" for each of these files was, respectively, 4KB, 24KB, 63.7MB.

(b) Describe as precisely as you can the relationship between (i) the exact size in bytes; (ii) the size expressed in KB or MB (if you're using Windows); (iii) the "size on disk" expressed in KB or MB. Suppose a file's exact size is 8100 bytes.  What do you think will be its "size on disk" expressed in KB?  What if the exact size is 8200 bytes?  Explain how you found the answer.

Answer:  The size on disk appears to be the file size rounded up to the next multiple of 4KB.  (The reason is that space on the disk is handed out in 4KB blocks, so a file always occupies a whole number of them.)  To test this theory, we should consider what happens with a file that is exactly 8100 bytes long, and one that is 8200 bytes long.  Since 8 KB = 8192 bytes, a file with 8100 bytes should occupy 8KB on disk, but one with 8200 bytes should occupy 12KB on disk.  I tried this experiment, adapting the procedure outlined in (d) below to make large text files, and found that the size on disk was as predicted.
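The rounding rule can be checked with a few lines of Python. This is only a sketch: it assumes the 4KB (4096-byte) block size inferred from the measurements above, and the name size_on_disk is made up for illustration.

```python
BLOCK_SIZE = 4096  # assumed allocation unit (4KB), inferred from the measurements


def size_on_disk(size_bytes, block_size=BLOCK_SIZE):
    """Round a file size up to the next whole multiple of block_size."""
    blocks = -(-size_bytes // block_size)  # ceiling division
    return blocks * block_size


for size in (43, 8100, 8200, 22528):
    print(size, "bytes ->", size_on_disk(size) // 1024, "KB on disk")
```

Under that assumption, the 43-byte and 22,528-byte files from part (a) come out at 4KB and 24KB, and the 8100- and 8200-byte files at 8KB and 12KB.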

Now, if you haven't already done so, use the Text Editor to create a small file.  Make sure to include some blank spaces, several different lines, and some tabs, but don't go overboard, as you will need to count all the characters you type.  When you've finished, save the result (using the option "Save" from the "File" menu) in the format "US ASCII-Basic Latin Characters" with the name "file1".

(c) What is the exact relationship between the number of visible characters, spaces, tabs, and lines in the file, and the exact file size in bytes?  Does a space count for one byte? no bytes?  more?  Does a tab (which looks like 6 or so spaces) count as 6 bytes? less? more? What about a blank line? You may have to experiment a little, revising the file, to be sure of the answer. (This is not a trick question; in the end, the answer is very simple and probably what you thought.)

My text file looks like this:

This is a sample file.

I hope    you like it.


Answer: The exact file size is 43 bytes.  The file contains 33 visible characters (the letters and the periods), 7 spaces, 1 tab, and 3 lines altogether.  Each of the two 'Returns' I typed to advance to the next line added one byte to the file, and each tab and space added one byte, so the total is 2 + 1 + 7 + 33 = 43.  Briefly, each character is a single byte, but 'character' includes such non-printing characters as spaces, tabs and new lines.
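The same count can be reproduced in Python. This sketch assumes the two Returns sit between the two lines of text, with no Return after the last line, and that the gap after "hope" is the single tab; the variable names are illustrative.

```python
# Assumed layout: two Returns between the lines, none at the end,
# and one tab between "hope" and "you".
text = "This is a sample file.\n\nI hope\tyou like it."

data = text.encode("ascii")  # US-ASCII stores one byte per character

spaces = text.count(" ")
tabs = text.count("\t")
newlines = text.count("\n")
visible = len(text) - spaces - tabs - newlines

print("visible:", visible, "spaces:", spaces, "tabs:", tabs,
      "newlines:", newlines, "total bytes:", len(data))
```

Every character, printing or not, contributes exactly one byte to the total.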

(d) Use the text editor to make a text file that is more than 1 MB in size, with as little typing as possible.  Tell me how you did it, but PLEASE don't mail me the file!

Answer: Type a long line of text, say 100 characters.  Then repeatedly select the whole document, copy, and paste.  Each time you do this, the document size doubles, so that after n repetitions, the size of the document is 100 x 2^n bytes.  We pass 1 MB when 2^n is larger than 10,000 (actually, somewhat larger even than that, since 1MB is 1,048,576 bytes, which is more than 1 million), so 14 repetitions are sufficient.  If you really want to do as little typing as possible, you can start with a single character and do 20 repetitions, which gives exactly 2^20 bytes = 1MB; one more repetition takes you past it.
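The number of copy-and-paste rounds can be worked out with a short loop. This is a sketch only; the function name doublings_to_exceed is hypothetical, and it takes 1 MB to mean 2^20 = 1,048,576 bytes.

```python
ONE_MB = 2 ** 20  # 1,048,576 bytes


def doublings_to_exceed(start_bytes, target_bytes=ONE_MB):
    """Count how many doublings it takes for the size to pass target_bytes."""
    size = start_bytes
    rounds = 0
    while size <= target_bytes:
        size *= 2
        rounds += 1
    return rounds


print(doublings_to_exceed(100))  # starting from a 100-character line
print(doublings_to_exceed(1))    # starting from a single character
```

Starting from 100 characters, 13 doublings give 819,200 bytes, still short of 1 MB, while 14 give 1,638,400. Starting from a single character, 20 doublings land exactly on 1,048,576 bytes, so the loop reports one more round to get strictly past it.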

Exercise 2 - Make a Web Page



I'll just give the answers to (d) and (e), since these are really the only places where you have to figure something out.


One of the instructions on the page specifies the background color.  It gives the color as six hex digits, in this case EEEEEE.  But the way to read this is as the hexadecimal representations of three separate byte values: EE EE EE.  These represent, respectively, the red, green and blue components of the color. Higher means brighter, and when all three components are the same, you get some shade of grey.  FF FF FF (255 255 255 in decimal) gives you white. EE EE EE (238 238 238 in decimal) is a very light grey, FF 00 00 (255 0 0) is a very rich red.  (I'll show you in class how to convert the hex digits to decimal, and vice-versa.) We'll have more to say in a couple of weeks about this way of representing colors.
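For anyone who wants to check the conversions before class, here is a small Python sketch. The function names hex_to_rgb and rgb_to_hex are made up for illustration; the only library feature used is Python's built-in int(text, 16), which reads a string as a base-16 number.

```python
def hex_to_rgb(code):
    """Split a six-digit hex color into (red, green, blue) decimal values."""
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(red, green, blue):
    """Write three byte values (0-255) as six uppercase hex digits."""
    return f"{red:02X}{green:02X}{blue:02X}"


# Each hex pair is (first digit x 16) + second digit, e.g. EE = 14*16 + 14.
for code in ("FFFFFF", "EEEEEE", "FF0000"):
    print(code, "->", hex_to_rgb(code))

print(rgb_to_hex(255, 0, 0))
```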


(d)  Change the Background Color. Edit the hex encoding of the background color so that the background becomes yellow.  (It's not obvious, not to me at any rate, how to get yellow by mixing different proportions of red, green and blue. You can come up with the answer by guessing and trying it out.  Another way is to hunt around in some standard application like Microsoft Word or PowerPoint that contains drawing tools or ways to change the color of text.  These include some way of specifying a "custom color" in terms of its red, green and blue components.)

Answer:  Yellow, somewhat unexpectedly, is obtained by mixing green and red, so setting the background color to FFFF00 will do it. You can reason it out like this, but you can also find the hex encodings of many colors in applications that contain drawing or painting tools.  After poking around PowerPoint on my Mac I found FFFF00 as the encoding of bright yellow, and FFFF33 and FFFFF9 for more muted yellows.

(e) How do you...? Suppose you wanted the text of your web page to contain the symbols < and >.  The problem is that if you just typed them as text, your browser would try to interpret them as part of tags and get all fouled up (try it and see).  So how do you do it?  You can find the answer by perusing an HTML manual, which is fine, but there's also a quicker way to figure it out.  Whichever way you choose, revise the text of your web page so that it contains these symbols.  In the writeup of your solutions, explain how you did it.

Answer: I just used the View HTML Source option in my browser for this page that contains the lab exercise, and looked for how the paragraph above was encoded!  What I saw was:

Suppose you wanted the text of your web page to contain the symbols &lt; and &gt;.

Evidently, &lt; is used to encode <, and &gt; is used to encode >.  If you place these encodings in your HTML, you'll get the desired result. (The letters 'lt' stand for 'less than', and 'gt' for 'greater than'.)
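If you ever need to produce these encodings automatically rather than by hand, Python's standard html module can do it. This is just an illustrative sketch, not part of the lab; note that html.escape also encodes the ampersand and quotation marks, not only < and >.

```python
import html

line = "the symbols < and >"

encoded = html.escape(line)       # replace < and > with their HTML encodings
decoded = html.unescape(encoded)  # turn the encodings back into the symbols

print(encoded)
print(decoded)
```

The first line printed is what you would type into the HTML source; the second is what the browser displays.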


I did not include 'Answers' to the Fun and Games Exercises.  Some of you had amusing theories about the ESP program!!