Read twitter in a one-per-line format without ever logging into the site
twitter_ebooks is a framework to make twitter bots, but it includes an ‘archive’ component to fetch historical account content which is apparently unique in that it 1) works with current TLS and 2) works the current twitter API. It stores the tweets in a JSON format which presumably matches the API return values. Usage is simple:
while read account
do
ebooks archive "${account}" "archive/${account}.json"
jq -r 'reverse | .\[\] | "\\(.created\_at|@sh)\\t \\(.text|@sh)"' "archive/${account}.json" >"archive/${account}.txt"
done
I ran into a bug with upstream incompatibilities which is easily fixed. Another caveat is that the twitter API only allows access 3200 tweets back in time for an account–all the more reason to set up archiving ASAP. Twitter’s rate-limiting is also extreme (15-180 req/15 min), and I’m worried about a problem where my naive script can’t make it through a list of more than 15 accounts even with no updates.
Edit: See here for an automatic version of the backup portion.
Connecting android to Windows and Mac, pretty easy. On arch linux? Major pain. Here’s what I did, mostly via the help of the arch wiki:
Rooted my phone. Otherwise you can’t back up major parts of the file system (including text messages and most application data) [EDIT: Actually, you can’t back these up over MTP even once you root your phone. Oops.]
Installed jmtpfs, a FUSE filesystem for mounting MTP, the new alternative to mount-as-storage on portable devices.
Enabled ‘user_allow_other’ in /etc/fuse.conf. I’m not sure if I needed to, but I did.
Plugged in the phone, and mounted the filesystem:
jmtpfs /media/android
The biggest pitfall I had was that if the phone’s screen is not unlocked at this point, mysterious failures will pop up later.
Synced the contents of the phone. For reasons I didn’t diagnose (I assume specific to FUSE), this actually fails as root:
Two forces pull at me: the desire to have few possessions and be able to travel flexibly, and the convenience of reading and referencing physical books. I discovered a third option: I have digital copies of all my books, so I can freely get rid of them at any time, or travel without inconvenience.
So that’s where we start. Here’s where I went.
I thought, if these books are just a local convenience for an online version, it’s more artistically satisfying to have some representation of that. So I printed up a card catalog of all my books, both the ones I have digital copies of and not:
That’s what a card looks like. There’s information about the book up top, and a link in the form of a QR code in the middle. The link downloads a PDF version of that book. Obviously being a programmer, the cards all all automatically generated.
For the books where I have a physical copy, I put the card in the book, and it feels like I’m touching the digital copy. My friends can pirate their own personal version of the book (saving me the sadness of lost lent-out books I’m sure we’ve all felt at times). And I just thing it looks darn neat. Some physical books I don’t have a digital version of, since the world is not yet perfect. But at least I can identify them at a glance (and consider sending them off to a service like http://1dollarscan.com/)
And then, I have a box full of all the books I *don’t* have a physical copy of, so I can browse through them, and organize them into reading lists or recommendations. It’s not nearly as cool as the ones in books, but it’s sort of nice to keep around.
And if I ever decide to get rid of a book, I can just check to make sure there’s a card inside, and move the card into the box, reassured nothing is lost, giving away a physical artifact I no longer have the ability to support.
I sadly won’t provide a link to the library since that stuff is mostly pirated.
Interesting technical problems encountered during this project (you can stop reading now if you’re not technically inclined):
Making sure each card gets printed exactly once, in the face of printer failures and updating digital collections. This was hard and took up most of my time, but it’s also insanely boring so I’ll say no more.
Command-line QR code generation, especially without generating intermediate files. I used rqrcode_png in ruby. I can now hotlink link qr.png?text=Hello%20World and see any text I want, it’s great.
Printing the cards. This is actually really difficult to automate–I generate the cards in HTML and it’s pretty difficult to print HTML, CSS, and included images. I ended up using the ‘wkhtmltoimage‘ project, which as far as I can tell, renders the image somewhere internally using webkit and screenshots it. There’s also a wkhtmltopdf available, which worked well but I couldn’t get to cooperate with index-card sized paper. Nothing else really seems to handle CSS, etc properly and as horrifying as the fundamental approach is, it’s both correct and well-executed. (They solved a number of problems with upstream patches to Qt for example, the sort of thing I love to hear)
The zbarcam software (for scanning QR codes among other digital codes) is just absolute quality work and I can’t say enough good things about it. Scanning cards back into the computer was one of the most pleasant parts of this whole project. It has an intuitive command UI using all the format options I want, and camera feedback to show it’s scanned QR codes (which it does very quickly).
Future-proofed links to pirated books–the sort of link that usually goes down. I opted to use a SHA256 hash (the mysterious numbers at the bottom which form a unique signature generated from the content of the book) and provide a small page on my website which gives you a download based on that. This is what the QR code links to. I was hoping there was some way to provide that without involving me, but I’m unaware of any service available. Alice Monday suggested just typing the SHA hash into Google, which sounded like the sort of clever idea which might work. It doesn’t.
Despite being semi-unmaintained, everything mostly works still. There were two exceptions–some major design problems around private repos. I only need to back up my public repos really, so I ‘solved’ this by issuing an Oauth token that only knows about public repos. And second, a small patch to work around a bug with User objects in the underlying Github egg:
Here’s how I added gmail to .mailrc for the BSD program mailx, provided by the s-nail package in arch.
account gmail {
set folder=imaps://example@gmail.com@imap.gmail.com
set password-example@gmail.com@imap.gmail.com="PASS"
set smtp-use-starttls
set smtp=smtp://smtp.gmail.com:587
set smtp-auth=login
set smtp-auth-user=example@gmail.com
set smtp-auth-password="PASS"
set from="John Smith <example@gmail.com>"
}
Replace PASS with your actual password, and example@gmail.com with your actual email. Read the documentation if you want to avoid plaintext passwords.
You can send mail with ‘mail -A gmail ’. If you have only one account, remove the first and last line and use ‘mail ’
Generate a Certificate Signing Request, which is sent to your authentication provider. The details here will have to match the details they have on file (for StartSSL, just the domain name).
# -subj "/C=US/ST=/L=/O=/CN=${DOMAIN}" can be omitted to fill in custom identification details
# -sha512 is the hash of your key used for identification. This was the reasonable option in Oct 2014. It isn't supported by IE6
openssl req -new -key ${DOMAIN}.key -out ${DOMAIN}.csr -subj "/C=US/ST=/L=/O=/CN=${DOMAIN}" -sha512
Submit your Certificate Signing Request to your authentication provider. Assuming the signing request details match whatever they know about you, they’ll return you a certificate. You should also make sure to grab any intermediate and root certificates here.
echo "Saved certificate" > ${DOMAIN}.crt
wget https://www.startssl.com/certs/sca.server1.crt https://www.startssl.com/certs/ca.crt # Intermediate and root certificate for StartSSL
Combine the chain of trust (key, CSR, certificate, intermediate certificates(s), root certificate) into a single file with concatenation. Leaving out the key will give you a combined certificate of trust for the key, which you may need for other applications.
I decided to post all of my purchases/income. This isn’t something I was totally comfortable with, but I couldn’t think of good reasons not to, and my default position is to release information. I think this is especially interesting since it’s not something I’ve seen made available before. Link: http://za3k.com/money.html
I think the analysis may be useful to other hackers, as people tend to be insane and cost-insensitive around money. I think having another persons’s finances to look at for comparison is something I’ve wanted for various reasons at various times, and it’s not commonly available. My selfish motivations are to get other people to tell me how I should be saving lots of money, and to feel like my financial decisions are under scrutiny (which is good and bad).