Saturday, October 29, 2011

Exercise


A bit off-topic. But you might find this handy if I happen to know things about Excel that you don't. Then again, if you know an easier way to do something I have done here, please do let me know.


Jazz recently provided a great link for CPI numbers going all the way back to 1913. Monthly numbers, from a strong source: the Bureau of Labor Statistics.

Trouble is, each year's numbers are all on one row. All 12 months, and two year-values besides. Then the next year's numbers are on the next line. This is not a useful format for graphing, in my experience. I want the numbers all in one column.

This post is about what I came up with to put all those CPI numbers into one column, in order, with the minimum amount of repetitive work and the minimum risk of error.

I did all this in OpenOffice Calc, which is free, which is why I have it. It has its own quirks, but it does work a lot like Excel. What I've done here will work in Excel, if you've got it.

1. I went to the BLS link and saved the page. (In Firefox, on the Firefox menu, click Save Page As.... Firefox wants to save the page as a text file, which is fine.)

2. I saved it to the desktop. (I save everything to the desktop. Everybody laughs when they see how littered my desktop is. You can save it wherever you want. But when you can't find it, I'll laugh at you.) I right-clicked on the file and from the pop-up menu (I use Windows) clicked Open With, and from the sub-menu OpenOffice.org Calc. (Something similar should work if you have Excel installed. Worst case, open Excel, click FILE:OPEN, change the file type to include TXT files, and select the BLS file. (Filename = cpiai.txt). Something like that.)

3. I had to fiddle with the Text Import window that opened up. But I got exactly what I wanted on the third try: The CPI data from BLS, with a year on each row and a column for each month. Not what I wanted for a final product, but what I wanted to start with.

4. Remember to save the file as an Open Office spreadsheet, or an Excel spreadsheet. Don't leave it as the default type, a text file, because you will lose all your improvements when you save and close the thing.

This doesn't sound like economics at all, does it. Sounds like my other passion, fiddling with computers. While Rome burns.

In Open Office (or Excel. Whatever) now you have a copy of the BLS CPI data. All you have to do now is make it useful. This is where the power of a spreadsheet can really save you some work.

5. I renamed the worksheet Source and and made a copy of the sheet to make my changes in. (Yeah it is easy enough to go get the thing again, but is is easier just to make another copy of the Source sheet if I need it.)

6. I narrowed down the columns so I could fit more on my screen. (My monitor is turned 'portrait' so typically I can see a lot of rows and not a lot of columns.)

7. I inserted four columns to the left of column A, making some blank space at the left side of the worksheet. I left a couple blank cells at top (just out of habit; often, they come in handy). In cell A3 I entered Date and in cell B3 CPI-U.

8. Now the fun begins. You know, the repetitive part. And figuring out how to minimize the repetition. In cell A4 (one row down from the cell where I typed Date) I entered Jan 1913 which is of course where the BLS numbers start. Below it I typed Feb 1913 and below that Mar 1913. Then I selected those three cells and dragged'em down the screen, expecting it to automatically fill in Apr 1913 and the rest of the months of 1913, and beyond.

9. It didn't work. So I typed in the rest of the months for 1913, always using 3-letter abbreviations (for no reason, except I thought it might be good to keep the lengths all the same. Turned out, it was).

10. When I got to January 1914 I stopped. Obviously there had to be a better way than typing every month-and-year for a hundred years' worth of months.

11. With the cursor sitting in cell A16, I looked the situation over. I knew I wanted the letters JAN from cell A4, and a space, and the yearvalue from cell A4 with 1 added to it. And actually, I wanted that in every cell from here on down. Not JAN, but the month-name from 12 rows up, and the yearvalue from 12 rows up with 1 added to it.

12. I can do that. Actually, I couldn't, because Open Office Calc wants semicolons between the numbers, where Excel wants commas as I recall. But the help was helpful and I figured it out long before I would have been done typing the next ten years of numbers manually, and it worked. You can see the formula I used in the image below.

13.

14. In cell A16 I typed an equal sign (which tells the computer it will be doing a calculation) followed by LEFT(A4;4) (which tells the computer to take the first 4 characters from the left end of the text in cell A4).

Next I hit the SPACEBAR once and then SHIFT-7 for the ampersand (&) and the SPACEBAR once again. Pretty sure I need spaces there. It might work without them, but I don't have to care about that. What the ampersand does is sticks chunks of text together. What it will do for me is stick the three letters JAN and the fourth character, a SPACE, into cell A16, followed by whatever I put after the ampersand. Which I didn't tell you about yet.

See how I am taking the LEFT 4 characters here? And because I was clever enough to use all three-letter month names, the 4th character will always be a SPACE. And as a matter of fact, the 5th character will always be the first number of the four-digit year value. And (as you might have guessed) the year value is what comes next in the formula.

By the way, this may seem like a lot of work. But it is a lot less work than typing in all those month names and year numbers by hand.

So. After the ampersand we want the yearvalue. But in cell A16 I want the next year after the yearvalue in cell A4. I want 1914, not 1913. So (as you can see in the formula in the image above) after the ampersand I put a 1 and a plus sign. That will add 1 to something. So now all I have to do is grab the yearvalue from cell A4 and I'll be done.

To get the yearvalue from cell A4 I typed MID(A4;5;99) (Remember, semicolons between the numbers if you're using Open Office Calc.) What that does is it starts somewhere in the MIDdle of cell A4 -- with the 5th character, actually, because that's what I asked for -- and gives me the next 99 characters or less. I use 99 as a rule, because it is easier than counting the maximum number of characters I might possibly need. Though in this case that number would I suppose be four. But 99 works. And I don't have to worry about what else might work.

So that's it. That's the whole formula. Now I enter the formula into cell A16 (by hitting the ENTER key)((boy this is really LCD isn't it?)) and make sure it gave me the result that I want: JAN 1914.

15. Now it gets easy. With the cursor in cell A16, type CTRL-C to copy the formula. Then press the down-arrow and type CTRL-V to past a copy of the formula into cell A17. FEB 1914. Try a few more to make sure it works. Then select a whole lot of cells in column A and press CTRL-V to past the formula into all of 'em at once.

16. Bingo. The date-values are done.

17. Heh. That was the easy part.

18. I worked out a way to copy the CPI numbers, too, but I didn't get it right away with the 1914 numbers. But to simplify this story, we will pretend that I did.

19. Go up to the top of the spreadsheet again. Click on cell B4, which is just below the text CPI-U that you typed in step 7. Just to the left of the selected cell it should say JAN 1913. With the mouse pointing at cell B4, click and hold down the left mouse button and drag the mouse down until you are pointing at cell B15. Just to the left of that cell it says DEC 1913. And release the mouse button.


So what you have done is, you have selected the 12 cells where you want to put the CPI values we got from the BLS, which should be over to the right and maybe a few rows down from the selected cells.

20. The selected cells are stacked one above the other in a column. The numbers we want are all next to each other in a row. What we have to do is transpose them.

21. This probably sounds like a lot of work. To transpose one row at a time. Yeah, don't worry about it. Spreadsheets are useful tools.

22. With those dozen cells selected, type an equal sign (again, telling the computer that it will have to figure something out) followed by the word TRANSPOSE and SHIFT-9 for the open-parenthesis character. Now (with no mouse buttons pressed) find the BLS numbers for 1913. Point the mouse to the CPI value for January 1913, then press and hold down the left mouse button. Slowly drag the mouse over to the right (holding the mouse button down) until the mouse is pointing at the CPI value for December 1913. Release the mouse button. Press SHIFT-0 (that is, SHIFT-ZERO) to type the close-parenthesis character. Now DO NOT PRESS ENTER.

23. With the CPI numbers for 1913 all selected and the closing parenthesis typed in, press and hold down the CTRL key and the SHIFT key at the same time, and while holding them down press ENTER once. Then release all the keys.

24. If you look back at your column B, cells B4 through B15 now contain the CPI values, transposed from the original where they were all in a row.

25.

IMPORTANT NOTE: In the formula shown in this image, you can see "curly braces" -- the "{" and "}" characters -- before the equal sign and after the closing parenthesis. YOU DO NOT HAVE TO TYPE THEM IN. In fact, YOU SHOULD NOT TYPE THEM IN. The computer puts them in there automatically when you hold down CTRL and SHIFT and press ENTER. The curly braces are not "characters". They are "code". They are special information that tell you that your formula was entered the special way, and that it applies (in this case) not to a single cell but to a dozen cells.

In the Formula Bar, you can read that I am transposing cells on row 21. Specifically, I am using cells F21 through Q21. But that's just pot luck. If I inserted an extra blank row at the top, I'd be using cells F22 through Q22. And if I stuck another column at the left, I'd be starting with cell G22, not F22. The important thing is that I selected the cells containing the BLS numbers for the 12 months of 1913, and plugged them in to the 12 cells for 1913 in column B.

26. Where was I?

27. What we have just done is to capture the BLS numbers for 1913 and put them into a useful form. Before we do the numbers for 1914, there is one more thing to be done.

28. With cells B4 through B15 still selected, click in the formula bar. (Notice that the curly braces disappear! That's okay.) Now with the arrow keys (or maybe the mouse) move the cursor so that when you type, what you type will appear immediately after the open-parenthesis character and immediately before the F of F21. (Note, however, that you might not have "F21". You might be transposing numbers from a different cell. That is perfectly okay.)

29. Type SHIFT-4 to put a dollar-sign after the open-parenthesis character. Now use the arrow keys to move the cursor to the right so that the next character you type will come immediately after the ":" and immediately before the "Q21" (or whatever cellname you have). Type SHIFT-4 again to put another dollar sign immediately after the ":". NOBODY SAID TO PRESS ENTER YET. Do *not* press ENTER. Not yet.

30. The dollar sign before the "F21" makes it so the "F" will not change. The dollar sign before the "Q21" makes it so the "Q" will not change. That's what we want, because the BLS numbers we are using start in column "F" (for January) and end in column "Q" (for December). If the explanation doesn't make sense, stick with me anyway. It'll make sense when you see it happen.

31. Now that you have typed in the two dollar-signs, we are done modifying the formula. DO NOT PRESS ENTER. As before, you have to press and hold down the CTRL key and the SHIFT key. With those keys pressed, press and release the ENTER key, then release all the keys. Your curly braces should come back.

32. Now we need to do 1914. So -- with cells B4 through B15 still selected, press CTRL-C to copy the formula. Then click on cell C5 -- one row down and one column to the right of the transposed January 1913 number. With cell C5 selected, press CTRL-V to paste the formula we copied.

33. You can check your work, if you want. But if you do, make sure not to change which cells are selected.

34.

In this image, you can see that we are using CPI values from cells F22 through Q22. There are dollar signs there, so what you see is "$F22" and "$Q22". Actually, those start- and end-cells are one row down from F21 and Q21 where the 1913 numbers are. That's good. That means we got the 1914 CPI numbers.

35. Now, without changing the selection of cells, press CTRL-X to cut them out of column C. Click on the cell in column B that is next to JAN 1914 in column A, and with that one cell selected, press CTRL-V to paste the CPI values for 1914 in place where we want them.

36.

Oh, I just noticed: For the first image I used the CPIAI.ODS file I started with. But for the rest of the images I used the TEMP CPIAI.ODS file that I'm using to help me write this post. Oh well.

37. Again, do not change the selection of cells. Press CTRL-C to copy the formula that gives you the 1914 values. Point to the cell that is one row down and one row to the right of the January 1914 CPI value. (In my example, point to cell C17.) Click on that cell and press CTRL-V to paste in the formula for the 1915 values. Now (without changing the selection) press CTRL-X to cut out the formula. Click on the cell that will hold the January 1915 value, and press CTRL-V to paste the formula into all the months of 1915.

38. At this point, it has become easy to transpose the next line. (Read item 37 again if you missed it.) But there is one more consideration before you finish transposing your data.

39. In the original BLS data, there is a blank line after 1915. There is a blank line every five years, after 1915 and 1920 and 1925 and so on. So when you copy the 1915 formula to create the 1916 formula, you don't point to the cell that is one down and one to the right. You point to the cell that is two down and one to the right. By going down two, you skip the blank line.

40. And that's it. CTRL-C, CTRL-V, CTRL-X, CTRL-V. The left hand barely moves. The right hand selects cells with the mouse. It goes quick: Much quicker than writing this post, for example.

41. By the way, if you mess up, just delete the cells that are wrong, select the most recent year's numbers that are right, and get back into the loop.


It seems there is an easier way for you to get the CPI numbers all in one column: You can use the file I used.

I checked it over, as much as I could stand, and I don't see any errors in it.

The workbook contains three pages:
Source: the original BLS textfile, imported to the spreadsheet.
Monthly: all the data transposed into one column
CFYA: Change From Year Ago values, calculated from the CPI-U values

If I forgot to delete it, there is also a page named "monthly prelim" which you can ignore.

I stored the files at Google Docs. Three versions:

1. The Google Docs version.
2. The Open Office Calc spreadsheet.
3. An Excel version (created by Open Office).

Hope this was a useful post.

No comments: