Sunday, February 4, 2018

How to migrate your emails from Yahoo to Gmail (even though Yahoo will try its best not to let you do it)

I had been trying to move my last 2k or so email messages out of my Yahoo inbox for the last 3 days. It should be a very simple process: enable POP in the donor Yahoo Mail account, set up a POP account in the target Gmail, and wait for both of them to do their magic.

For (not so) unknown reasons, that doesn't work. With a thousand messages sitting in my Yahoo inbox, Gmail's POP fetcher will just fail with the error message:

[SYS/TEMP] Server error - Please try again later.

If you search for this error you will find you are not alone in this boat. This is a widely reported issue, and it happens with any kind of email client, such as Outlook, Thunderbird and so on, not only Gmail. The fact is, people have had this problem for a long time and have managed to fix it in a lot of different (random) ways.

To me this is clearly a problem on Yahoo's side alone, and I have come to believe there is absolutely nothing we can do to fix it. In the example below I use `openssl` to talk directly to Yahoo's POP3 server and run a little experiment:
$ openssl s_client -quiet -connect pop.mail.yahoo.com:995 
depth=2 C = US, O = "VeriSign, Inc.", OU = VeriSign Trust Network, OU = "(c) 2006 VeriSign, Inc. - For authorized use only", CN = VeriSign Class 3 Public Primary Certification Authority - G5
verify error:num=20:unable to get local issuer certificate
verify return:0

+OK Hello from jpop-0.1
-ERR Invalid Command.
USER mybeautifulname
+OK Password required.
PASS mysecretpassword
+OK Maildrop ready, (JPOP server ready).
RETR 1
+OK 5760 octets.
...
RETR 500
-ERR [SYS/TEMP] Server error - Please try again later.
RETR 500
+OK 9438 octets.
...

As you can see, the first time we tried to retrieve message #500 the server failed with exactly the same error message Gmail has been reporting. Then we tried a second time and it worked. I have been tinkering with this issue for the past 3 days, trying to find a pattern in which messages are most likely to fail, or how small I should keep my inbox to minimize the risk.
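The manual experiment above can be scripted. The following is only a minimal sketch using Python's standard `poplib`, assuming that simply repeating `RETR` is enough to get past the `[SYS/TEMP]` error, as it was in the session above. The function names, the retry count and the delay are my own choices for illustration, not anything Yahoo or Gmail documents.

```python
import poplib
import time


def retr_with_retry(conn, msg_num, attempts=5, delay=1.0):
    """Fetch one message, retrying when the server answers [SYS/TEMP].

    `conn` is any object with a poplib-style retr() method. Returns the raw
    message bytes, or raises the last error once `attempts` is exhausted.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            _response, lines, _octets = conn.retr(msg_num)
            return b"\r\n".join(lines) + b"\r\n"
        except poplib.error_proto as exc:
            if "SYS/TEMP" not in str(exc):
                raise  # a different failure; retrying is unlikely to help
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay)  # hypothetical pause between attempts
    raise last_error


def fetch_all(host, user, password):
    """Yield (message number, raw bytes) for every message in the maildrop."""
    conn = poplib.POP3_SSL(host, 995)
    try:
        conn.user(user)
        conn.pass_(password)
        count, _size = conn.stat()
        for num in range(1, count + 1):
            yield num, retr_with_retry(conn, num)
    finally:
        conn.quit()
```

Something like `for num, raw in fetch_all("pop.mail.yahoo.com", "mybeautifulname", "mysecretpassword"): ...` would then walk the whole inbox, retrying each failed message on the spot instead of giving up on the first error.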

The problem is I came to no conclusion, and the behavior just seems completely random to me. To make things worse, Gmail drops all the work done so far on the first error it encounters when importing email from Yahoo.

So it was clear to me what I had to try next: make my own machine collect all my Yahoo messages and then deliver them to Gmail.

Step #1: pulling email from Yahoo

Note: the tools needed for this step are no longer available in a standard macOS High Sierra install, so you might want to use a Linux distro.

We are using `fetchmail` and `procmail` for this, and will start by creating ~/.fetchmailrc:
poll pop.mail.yahoo.com
protocol pop3 
uidl
user 'mybeautifulname' 
password 'mysecretpassword' 
mda '/usr/bin/procmail'
options
keep
ssl
sslcertck

We are basically telling `fetchmail` which server to poll, what our credentials are, and what to do with the downloaded messages (hand them to `procmail`). Note that the `uidl` option keeps track of the work already done, so we can retry the failed messages in later runs.

We now point `procmail` to a custom mbox file in ~/.procmailrc:
DEFAULT=/tmp/mail/pulled_email.mbox

We know that Yahoo will fail to deliver a few random messages each time we try a download. That's why we run `fetchmail` 10 consecutive times:
$ mkdir /tmp/mail
$ for i in {1..10}; do fetchmail; done
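Instead of a fixed count of 10, the loop could stop as soon as there is nothing left to pull. This is a rough sketch, assuming `fetchmail` follows its documented exit statuses (0 when messages were retrieved, 1 when no mail was waiting) and that any other status is a failed run worth repeating. The function names and the `max_runs` cap are hypothetical.

```python
import subprocess


def run_fetchmail_once():
    """Run fetchmail with the default ~/.fetchmailrc and return its exit status."""
    return subprocess.run(["fetchmail"]).returncode


def pull_until_empty(run_once=run_fetchmail_once, max_runs=10):
    """Call run_once() until it reports 'no mail' (status 1) or max_runs is hit.

    Returns the list of exit statuses seen, one per run.
    """
    statuses = []
    for _ in range(max_runs):
        status = run_once()
        statuses.append(status)
        if status == 1:  # assumed meaning: no mail awaiting retrieval
            break
    return statuses
```

Calling `pull_until_empty()` after creating /tmp/mail does the same job as the shell loop, and the returned list shows how many runs it took.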

Read the mbox file and make sure the message count matches the number of messages in your webmail:
$ mailx -f /tmp/mail/pulled_email.mbox
Heirloom Mail version 12.4 7/29/08.  Type ? for help.
"/tmp/mail/pulled_email.mbox": 1067 messages 1067 unread

If you are missing messages, run `fetchmail` again until you have them all.
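If `mailx` is not at hand, the count can also be checked with Python's standard `mailbox` module. This is a small sketch, not part of the original procedure. It also reports the number of distinct Message-ID values, on the assumption that a gap between the two numbers points to duplicates picked up across repeated runs. The function name `mbox_stats` is made up for this example.

```python
import mailbox


def mbox_stats(path):
    """Return (total messages, distinct Message-ID values) for an mbox file."""
    box = mailbox.mbox(path, create=False)
    try:
        total = 0
        ids = set()
        for msg in box:
            total += 1
            msg_id = msg["Message-ID"]  # header lookup is case-insensitive
            if msg_id:
                ids.add(str(msg_id).strip())
        return total, len(ids)
    finally:
        box.close()
```

For example, `mbox_stats("/tmp/mail/pulled_email.mbox")` returns a pair whose first number should match the message count shown in the Yahoo webmail.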

Step #2: pushing email to Google

There's a really nice tool called Got Your Back (GYB) that gets this work done. So we head to GitHub to download and install it:

$ wget https://github.com/jay0lee/got-your-back/releases/download/v1.0/gyb-1.0-linux-x86_64.tar.xz 
$ tar -xf gyb-1.0-linux-x86_64.tar.xz 
$ cd gyb
$ touch nobrowser.txt

Or if you are running macOS:

$ curl -LO https://github.com/jay0lee/got-your-back/releases/download/v1.0/gyb-1.0-macos.tar.xz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   610    0   610    0     0    758      0 --:--:-- --:--:-- --:--:--   757
100 7492k  100 7492k    0     0  1021k      0  0:00:07  0:00:07 --:--:-- 1599k

$ tar -xf gyb-1.0-macos.tar.xz 
$ cd gyb
$ touch nobrowser.txt

Fire up GYB, pointing it at the folder where your mbox file is located. You don't need to give the name of your file explicitly, since GYB will automatically find and load all mbox files inside that folder.

$ ./gyb --email target@gmail.com --action restore-mbox --local-folder /tmp/mail --label-restored GYB
You might want to pick option 0, because we will need read access in case any message fails to upload.
Select the actions you wish GYB to be able to perform for target@gmail.com

[*]  0)  Gmail Backup And Restore - read/write mailbox access
[ ]  1)  Gmail Backup Only - read-only mailbox access
[ ]  2)  Gmail Restore Only - write-only mailbox access and label management
[ ]  3)  Gmail Full Access - read/write mailbox access and message purge
[ ]  4)  No Gmail Access

[*]  5)  Groups Restore - write to Google Apps Groups Archive
[*]  6)  Storage Quota - Drive app config scope used for --action quota

      7)  Continue

GYB will give you a URL to copy and paste into your browser; you then need to give GYB the authentication token you get back. Watch the magic being done after that.
Authentication successful.

Using backup folder /tmp/mail

Restoring from 16.62MB file /tmp/mail/pulled_email.mbox...
large files may take some time to open.
restoring 10 messages (30/1067)                                                 
ERROR: 400: Bad Request. Skipping message restore, you can retry later with --fast-restore
restoring 10 messages (260/1067)                                                
ERROR: 400: Bad Request. Skipping message restore, you can retry later with --fast-restore

A few messages failed for whatever reason, and it's a little hard to tell which ones. You can try the trick below:

$ ./gyb --email target@gmail.com --action backup --search "label:GYB"
$ fgrep -i 'Message-ID:' /tmp/mail/pulled_email.mbox | sort -u > all.txt
$ find . -name "*.eml" -exec fgrep -i 'Message-ID:' {} \; | tr -d '\r' | sort -u > restored.txt
$ comm -23 all.txt restored.txt

The last command will give you a list of message IDs that might be missing from your restore. Search for them in your mbox file. Once you have identified the missing messages, you can try to import them again using the regular method: head back to the Yahoo webmail, move every message except the ones missing in Gmail to some other folder, and let Gmail download what is left through POP3. Hopefully only a handful of messages are missing for you, as there were only 4 for me.
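The same comparison can be done by parsing the headers instead of grepping for them, which avoids false matches on "Message-ID:" text quoted inside message bodies. This is only a sketch using Python's standard `mailbox` and `email` modules. It assumes, like the `find` command above, that the GYB backup leaves its `.eml` files somewhere below a single folder. The function name and the `eml_root` parameter are hypothetical.

```python
import email
import mailbox
from pathlib import Path


def _clean(value):
    """Normalize a header value; return None when the header is absent."""
    return str(value).strip() if value else None


def missing_message_ids(mbox_path, eml_root):
    """Return sorted Message-IDs present in the mbox but absent from the .eml tree."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        pulled = {_clean(msg["Message-ID"]) for msg in box}
    finally:
        box.close()

    restored = set()
    for eml in Path(eml_root).rglob("*.eml"):
        with open(eml, "rb") as handle:
            parsed = email.message_from_binary_file(handle)
        restored.add(_clean(parsed["Message-ID"]))

    # Messages without a Message-ID header cannot be matched, so drop None.
    return sorted((pulled - restored) - {None})
```

Run from the GYB folder, `missing_message_ids("/tmp/mail/pulled_email.mbox", ".")` would list the IDs to look for in the Yahoo webmail.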