How to: fix spreadsheet dates that are in both US and UK formats

640px-Date_format_by_country.svg

This map by Artem Karimov shows which countries use which data formats

It’s quite common when working with Google Sheets to have data set to US format (Month-Day-Year) without realising it. This is because Google will format your dates based on what ‘locale’ or language you have set – and the default is US English.

Instructions on how to change that are here – but what if it’s too late? What if you’ve already inputted or imported data which, when updated to a different format, will make it the wrong date?

Converting dates in two different formats

This is what happened to one journalist with the data shown below.

Screenshot of bad dates

In the screenshot above dates are both left- and right-aligned. This means some are being treated as text, not dates.

Because dates have been entered as UK format (DD/MM/YY) but interpreted and stored by Google Sheets as US format, it means that column has two types of dates:

  • Those entered as UK dates but stored numerically as US ones (so the 3rd of January – 03/01 – is interpreted as March 1st);
  • And those stored as text because they would not work as US dates (there is no 21st month, so 21/01/2013 is not stored as a date at all, but as a series of characters)

The numbers are right-aligned; the strings of characters are left-aligned (Google Sheets and Excel always align numbers and strings this way by default)

Converting the spreadsheet from US to UK formatting would solve the dates stored as text (’21’ would then be treated as a day not a month); but not those stored as US dates: 03/01 will still be stored as ‘March 1’.

Solving the problem part 1: using the TEXT function

The solution comes through converting all the dates to text.

And the key to that is the TEXT function.

The TEXT function needs two ingredients:

  1. what you want to convert to text;
  2. the format you want to convert it to.

The function is primary designed for converting numerical dates and amounts into particular formats by, for example, adding currency symbols, or giving the year in 4 digits while the day and month are shown as a word (e.g. “Wednesday 4 May 2016”).

The format is indicated by various characters: "mm" for example means ‘the month represented by digits’ while "dd" means ‘the day of the month represented by digits’.

Data journalism book Finding Stories with spreadsheets

In this case, however, it allows us to create a ‘fake’ date by putting numbers in a particular order. Here’s how it works:

If we want to format a date as ‘DD/MM/YY’ then a typical formula might look like this:

=TEXT(C2,"dd/mm/yy")

Remember, though, that although the dates have been entered in one format they have been stored numerically as US ones, so 03/01 is interpreted wrongly as March 1st. The formula above would result in ’01/03/13′.

Instead, then, we are going to write a formula like this:

=TEXT(C2,"mm/dd/yyyy")

This results in ’03/01/2013′ – exactly the date we wanted, even though we’ve cheated to get it.

 

Key to all of this is that the result is also stored as a string of characters, aligned left. That means it will not be affected by any change of locale settings.

What about cell C3? Well because that is already text, repeating this formula for cell C3 doesn’t really do anything: the original date is copied across without change.

=TEXT(C3,"mm/dd/yyyy")

And if you copy that formula all the way down one column (D, for example) then you will end up with a series of dates all consistently formatted as strings, in the same format.

Now you can change the spreadsheet’s overall locale – and therefore date formatting – settings to UK English or equivalent, without those dates being affected.

Converting dates as text back into dates as numbers

But you still have a problem: when formatted as text, dates cannot be properly sorted (the day of the month determines the order primarily, regardless of month or year), and you cannot easily find out what day of the week a date was, or calculate the time since a date.

To convert the date-as-text back into a proper date stored by Google Sheets as a number, you will need to use the DATE function.

The DATE function performs the reverse action: it turns text into a date-as-number.

To do this it needs 3 ingredients: the year, the month, and the day.

You can extract these separately from your new date-as-text column using the RIGHT, MID and LEFT functions respectively: these help you grab characters from the right of a cell, the middle of a cell, or the left of a cell. For example this formula:

=RIGHT(D2,4)

…will start from the far right of cell D2 and grab 4 characters. In other words, it will grab the last 4 characters in that cell. This formula:

=LEFT(D2,2)

…will grab the first two characters from the left of cell D2. And this formula:

=MID(D2, 4, 2)

…will go to cell D2, start from the 4th character in that cell, and grab 2 characters from that point.

You could use those 3 formulae to create new columns for ‘Year’, ‘Day’ and ‘Month’.

If those columns were E, F and G you can then combine them using the DATE function like so:

=DATE(E2, G2, F2)

Alternatively, if you were feeling more ambitious you could instead skip the 3 new columns and combine all those formulae into one like this:

=DATE(RIGHT(D2, 4), MID(D2, 4, 2), LEFT(D2, 2))

That may be impressive, but it’s useful to have columns for year, month and day anyway so you may as well do it the long and simple way.

Whichever way you do it, you should now have a new column with dates aligning right which means they are stored as numbers and can be properly sorted, used in calculations, or converted into weekdays or other parts.

Data journalism MA

Watch out for my new course, starting in 2017

Advertisements

9 thoughts on “How to: fix spreadsheet dates that are in both US and UK formats

  1. Pingback: 4 examples of computational thinking in journalism | Online Journalism Blog

  2. Doc

    How do you add the date-as-text column without impacting those stored as dates already. For example, if you have a large number of dates, and some are text and some are date format. Using the last few steps here fouls up those already in date format.

    Reply
    1. Paul Bradshaw Post author

      Those stored as dates already are converted to a text string. Those stored as text remain as a text string. So you end up with a column where *all* the dates are a text string, not a mixture of text and date-as-number. You just need to make sure you use “mm/dd/yyyy” or an equivalent which is the same as the way the text strings are formatted.

      The next step is then to convert those to a proper date – which works because they’re no longer in mixed formats and they all use the same pattern (mm/dd/yyyy for example).

      The last few steps shouldn’t foul up anything – unless you’ve chosen a different format to the way the text strings are organised.

      Reply
  3. georginahearth

    Hi,

    My ‘UK’ dates are stored as general, not as text. So when completing the first step =text(A2,”mm/dd/yyyy”) these also reverse becoming the US format. Any clues on how to fix this?

    Cheers!

    Reply
    1. Paul Bradshaw Post author

      As long as you can get them all in the same format in that column (mm/dd or dd/mm), then it doesn’t matter. Once you move to the next step you are making sure that they are all treated the same, and can change the formatting.

      Reply
  4. maggy

    I think I may have found a simpler method by accident, hopefully it will work for everyone
    Im in UK region and need dd/mm/yy dates but my Google Sheet has dates in US format mm/dd/yy and appearing like your example above with different alignments

    1. I open the spreadsheet and go to File > Spreadsheet Settings and change the locale to US

    2. I select my column of dates and go to Data > Split Text To Columns with the Separator set as ‘Detect Automatically’ and this converts all the dates to text and leaves them in the same column in US mm/dd/yy format, all aligned the same. (I didn’t expect this behaviour but it worked)

    3. I switch back my locale to UK,

    4. I select my column of dates again and Format > Number > dd-mm-yy and the date is presented correctly for me in the UK format I want

    Reply
  5. Rob Farnell

    I was having all sorts of pain with a CSV generated by a system that created US date formats with timestamps. None of the suggestions above were giving me the answer I wanted, it just didn’t want to interpret any cell that was higher than 12 for what it thought was the month. I was able to resolve it with the Power Query Editor rather than just opening the CSV in Excel and selecting the date/time formats. There is an option on the column selection to choose locale and selecting US Date and Time. This worked immediately and didn’t need any of the finicky massaging of extra columns etc.
    Hope this helps someone else in the future.

    Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s