Spreadsheets for Historians


Historians’ Spreadsheet

Many scholars, even historians, use spreadsheets for tasks such as grading or accounting. Some might use programs such as Microsoft’s Excel Spreadsheets, while others use Google Spreadsheets or even Open Office. Regardless of which program we use, spreadsheets are powerful tools that allow us to collect and query information.

In most tasks for which spreadsheets are suitable, we only use one or two basic formulas or functions to manipulate the information contained within each cell. In a budget, for example, we might use the SUM function to add our expenses in one month. We might also use AVERAGE to see our average monthly expense total.

Grading and budgets are important, but historians often ask more complex questions about their data. Simple functions such as SUM and AVERAGE can be useful to an historian, but there are other functions that can save time and labour. Although collecting, manipulating, and maintaining data is important and should not be overlooked, historians can use spreadsheets and functions to shift their time balance from data creation toward data interpretation.

The following sections introduce good principles for using spreadsheets, helpful hints, and functions that historians might find useful for their research.

 

Spreadsheet Principles

  1. One Column, One Value

    • Each column in a spreadsheet should contain only one type of value.

    • For example, a column called “Name” that includes a person’s full name is not as useful as two columns called “Name_First” and “Name_Last”.

Two values in one column. Imagine trying to sort by last name.

Two values in one column. Imagine trying to sort by last name.

One value in one column

One value in one column

 

  1. Consistency is Key

    • Choose and use one format for each type of value

    • For example, do not switch back and forth between “08/12/2014” and “August 12, 2014”

  1. Combining is Easier than Separating

    • For example, creating a column called “Name_Full” that combines “Name_First” and “Name_Last” is much easier than splitting “Name” into two columns.

  1. More Data Means Less Work

    • Recording more information in more columns with greater specificity will save work later when we realize we’d like to investigate a new question.

    • For example, analyzing the location of items in different countries is more difficult when the “Location” column includes values like “Lisbon”. We might know that Lisbon is in Portugal, but the computer or software we use might not.

  1. Spreadsheets Should Be Readable

    • Most software can interpret codewords as easily as normal language, so create columns called “Name_First” rather than “NM_FST”.

    • Use common, recognizable titles in case we share our data with someone else who might not recognize “Name_Christian”

     

Helpful Hints

  1. What to do with blank cells

    • Different types of visualization, such as mapping or pie charts, can interpret blank cells in different ways.

    • We might need to analyze what we do not know.

    • The best option is to use a placeholder such as “null” when a value is missing. If we need blank cells for mapping, we can later create a duplicate column that replaces “null” with a blank cell

  1. Think carefully about our data

    • The decisions we make about our data in the early stages of collection and manipulation will shape our analysis and opportunities in later stages.

    • Spreadsheets are a combination of tools and parts. The more parts we include, the more we can accomplish with the tools.

  1. Save and copy often

    • Because functions are powerful, they can be extremely helpful or extremely destructive. Copying our files regularly will prevent serious loss when a function goes astray.

     

Useful Formulas or Functions

Basic components of a function

A Basic Function

A Basic Function

A Function Using a Range

A Function Using a Range

parts-3

A Function Using Cells and Numbers

  1. SUM(value1,value2)

    • Performs mathematical functions using cells and/or numbers

    • Examples:

      • SUM(A1,B1)

      • SUM(A1+250)

      • SUM(100+(A1/B1))

  1. AVERAGE(value1, value2) or (value_range)

    • Calculates the average of two cells or a range of cells

    • Examples:

      • AVERAGE(A1,A2)

      • AVERAGE(A1:Z1)

  1. IF(logical_test, value_if_true, value_if_false)

    • Performs a test and returns two different results based on the answer

    • Examples:

      • IF(A1=“Dog”, Woof”, “Meow”)

        • If A1 equals “Dog”, return “Woof”, otherwise return “Meow”

      • IF(B1<>“Dog”, “Not a dog”, “Woof”)

        • If B1 does not equal “Dog”, return “Not a dog”, otherwise return “Woof”

  1. COUNTIF(range,criteria)

    • Counts the number of cells in a range that meet the criteria

    • Example:

      • COUNTIF(A1:A20, “Dog”)

        • Result: number of rows with “Dog” in column A

      • COUNTIF(B1:B250, “1812”)

        • Result: number of rows with “1812” in column B

  1. CONCATENATE(text1, text2)

    • Joins together multiple cells into one string

    • Examples:

      • CONCATENATE(Name_First, “ “, Name_Last)

        • Result: “Betsy Doyle”

         

 Conclusions

The principles, hints, and functions described above are only a very brief introduction to spreadsheets for historians. Excel, Google, Open Office, and other spreadsheet programs have hundreds of functions. Not all of them will be useful to historians, but some functions can save us time and frustration. Use the lists of functions provided by the software developer to learn how they all work and how to use them:

Microsoft Excel

Google Spreadsheets

Open Office

If you can’t figure out how to use a function, search for it on Google. Test out their capabilities, experiment, and do not be afraid to fail. Finding the right function might take time, but the results will be worth the effort.

— This article has been cross-posted to the Doing Digital History Institute website.