Categories: sumana | Programming
Software development, system administration, and similar
# 27 Jun 2017, 02:52PM: Learning To Get Around In Java:
One of Changeset Consulting's clients is working on modernizing a legacy web application; we're improving both its structural underpinnings and its user interface and outgoing APIs. It is like we are Chief O'Brien in the first season of Star Trek: Deep Space Nine, surveying and retrofitting Terok Nor. But that's not a fair comparison; O'Brien has to not only grapple with alien engineering approaches, but with the resentful and deliberate trashing the Cardassians inflicted on the station before handing it over. I haven't seen Stargate Atlantis but perhaps that's a better analogy; with every component of this long-asleep lost city that we resuscitate, a new console or room shimmers to life. Which is pretty rewarding!
The original authors wrote this application in Java. I've never worked on a Java application before, so the last few weeks have been quite an education in the Java ecosystem, in its tools and frameworks and libraries. We're improving the installation and deployment process, so now I'm more familiar with Ant, Gradle, Maven, OpenJDK, JDBC, Hibernate, and WildFly. I've gotten some API documentation in place, so now I know more about Spring and Javadoc.
As I was explaining to a friend this weekend, the overwhelming thing isn't Java as a language. It is a programming language and you can program in it, fine. The overwhelmption is the seemingly endless chain of plugins, platforms, and frameworks, and the mental work to understand what competes with, supersedes, integrates with, or depends on what.
Imagine you come to visit New York City for the first time, and wish to visit a specific address. First you need to work out where it is. But you do not have a map; there is no unified map of the whole place. Surely you can figure this out. Watch out: if someone doesn't tell you what borough an address is in, it's probably in Manhattan, but then again maybe not. There are multiple streets with the same name, and "31st Street and Broadway" in Queens is quite far from "31st Street and Broadway" in Manhattan. The avenue numbers go up westwards in Manhattan, eastwards in Brooklyn, and northwards in Queens. And so on.
You ask around, you see sketches of maps other people have made on their journeys, and eventually you feel pretty confident that you know the rough distance and direction to your destination. Now, how do you get there from your hotel room?
You probably don't want to walk all the way; for one thing, it's illegal and dangerous to walk on the freeways. This is why we have the subway (express and local), and buses (express and local, both privately and publicly run), and government-regulated taxis (street-hailable cabs and private car services), and bike rental, and commuter rail, a funicular/tram, car rental, ferries, and so on. Also there are illegal rideshare/taxi services that lots of people use. You try to learn some nouns and figure out what sort of thing each is, and what's a subset of what.
A MetroCard works on some of these modes and not others, and some transfers from one ride to another cost you nothing, and you can't use an unlimited-ride card twice at the same station or on the same bus within 18 minutes.* You can bring a bike on some MTA-run services but not all, not all the time. There are whole neighborhoods with no subway service, and whole neighborhoods with approximately no street parking. At rush hour the trains get super full. Service changes at night, on the weekend, and on holidays. Cars and buses get stuck behind accidents and parades. People and signs in Manhattan refer to "uptown" and "downtown" as though they are cardinal directions; they often correlate to "north" and "south" but not always. Metro North trains terminate at Grand Central, but Long Island Railroad trains terminate at New York Penn Station, which is named after Pennsylvania because it's where you can catch a train to Pennsylvania,** and there's a Newark Penn Station too but over a crackling loudspeaker those two station names sound very similar so watch out. And so on.
You're lucky; you find a set of cryptic directions, from your hotel to the destination address, based on a five-year-old transit schedule. It suggests you take a bus that does not exist anymore. Sometimes you see descriptions of travel that you think could be feasible as a leg of your journey, and you read what other people have done. They talk about "Penn Station" and "the train" without disambiguating, refer to the subway as "the MTA" even though the MTA also runs other transit, talk about "the 7" without distinguishing local from express, and use "blocks" as a measure of distance even though some blocks are ten times as long as others.
Aaaaagh. And yet: you will make it. You will figure it out. New Yorkers will help you along the way.
The decades-old Java ecosystem feels overwhelming but this application overhaul is like any other task. Things are made of stuff. Human programmers made this thing and human programmers can understand and manipulate it. I'm a human programmer. I made Javadoc do what I wanted it to do, and now the product is better and our users will have more information. And every triumph earns me a skill I can deploy for other customers and groups I care about.
* Just long enough for you to enjoy a little break from the podcast you're making with President Nixon!
** Also see St. Petersburg's Finland Station.
# 15 Oct 2016, 01:55PM: New Zine "Playing With Python: Two of My Favorite Lenses":
MergeSort, the feminist maker meetup I co-organize, had a table at Maker Faire earlier this month. Last year we'd given away (and taught people how to cut and fold) a few of my zines, and people enjoyed that. A week before Maker Faire this year, I was attempting to nap when I was struck with the conviction that I ought to make a Python zine to give out this year.
So I did! Below is Playing with Python: 2 of my favorite lenses. (As you can see from the photos of the drafting process, I thought about mentioning pdb, various cool libraries, and other great parts of the Python ecology, but narrowed my focus to bpython and python -i.)
Playing with Python
2 of my favorite lenses
[magnifying glass and eyeglass icons]
by Sumana Harihareswara
When I'm getting a Python program running for the 1st time, playing around & lightly sketching or prototyping to figure out what I want to do, I [heart]:
bpython & python -i
[illustrations: sketch of a house, outline of a house in dots]
bpython is an exploratory Python interpreter. It shows what you can do with an object:
>>> dogs = ["Fido", "Toto"]
append count extend index insert pop remove reverse sort
And, you can use Control-R to undo!
[illustrations: bpython logo, pointer to cursor after dogs.]
Use the -i flag when running a script, and when it finishes or crashes, you'll get an interactive Python session so you can inspect the state of your program at that moment!
$ python -i example.py
Traceback (most recent call last):
File "example.py", line 5, in
toprint = varname + "entries"
TypeError: unsupported operand type(s) for + : 'int' and 'str'
[illustration: pointer to type(varname) asking, "wanna make a guess?"]
More: "A Few Python Tips"
This zine made in honor of
NYC's feminist makerspace!
CC BY-SA 2016 Sumana Harihareswara
Everyone has something to teach;
everyone has something to learn.
Here's the directory that contains those thumbnails, plus a PDF to print out and turn into an eight-page booklet with one center cut and a bit of folding. That directory also contains a screenshot of the bpython logo with a grid overlaid, in case you ever want to hand-draw it. Hand-drawing the bpython logo was the hardest thing about making this zine (beating "fitting a sample error message into the width allotted" by a narrow margin).
Libby Horacek and Anne DeCusatis not only volunteered at the MergeSort table -- they also created zines right there and then! (Libby, Anne.) The software zine heritage of The Whole Earth Software Review, 2600, BubbleSort, Julia Evans, The Recompiler, et alia continues!
(I know about bpython and python -i because I learned about them at the Recurse Center. Want to become a better programmer? Join the Recurse Center!)
# 26 Sep 2016, 09:33AM: iCalendar Munging with Python 3, Requests, ics.py, and Beautiful Soup:
Leonard and I love seeing movies at the Museum of the Moving Image. Every few months we look at the calendar of upcoming films and decide what we'd possibly like to see together, and put it on our shared calendar so we remember. And for every showing (example) the MoMI provides an iCalendar (.ics) file, to help you add it to your electronic calendar. But it's a pain to individually download or refer to each event's .ics file and import it into my electronic calendar -- and the museum's .ics files' DTEND times are often misleading and imply that the event has a duration of 0 seconds. (I've asked them to fix it, and some of their calendar files have correct durations, but some still have DTEND at the same time as DTSTART.)
Saturday morning I had started individually messing with 30+ events, because the MoMI is doing a complete retrospective of Krzysztof Kieslowski's films and I am inwardly bouncing up and down with joyous anticipation about seeing Dekalog again. And then I thought: I bet I can automate some of this tedious labor!
So I did. The create-fixed-ics.py script (Python 3) takes a plain text file of URLs separated by newlines (see movie-urls-sample-file.txt for an example), downloads iCalendar files from the MoMI site, fixes their event end times, and creates a new unified .ics file ready for import into a calendar. Perhaps the messiest bit is how I use a set of regular expressions, and my observations of the customs of MoMI curators, to figure out the probable duration of the event.
Much thanks to the programming ecology that helped me build this, especially the people who made RegExr, Beautiful Soup (hi Leonard), Requests, ics.py, and the bpython interpreter, and the many who have written excellent documentation on Python's standard library. Thanks also to Christine Spang, whose "Email as Distributed Protocol Transport: How Meeting Invites Work and Ideas for the Future" talk at Open Source Bridge 2015 (video) introduced me to hacking with the iCalendar format.
- It can be a bit slow as the number of URLs adds up -- it took maybe 5 minutes to process about 31 events. I oughta profile it and speed it up. But I usually only need to do this about six times a year.
- This script is not careful, and will overwrite a previously created .ics file at the same address (in case you're running it twice in one day). It has no tests and approximately no error-checking. This was a scratch-my-own-itch, few-hours-on-a-Saturday project. No Maintenance Intended.
- Absolutely not an official project of the Museum of the Moving Image.
# (1) 27 Jul 2015, 02:00PM PST: Slides & Code from HTTP Can Do That?!:
My slides are up, as is demonstration code, from "HTTP Can Do That?!", my talk at Open Source Bridge last month. I am pleased to report that something like a hundred people crowded into the room to view that talk and that I've received lots of positive feedback about it. Thanks for help in preparing that talk, or inspiring it, to Leonard Richardson, Greg Hendershott, Zack Weinberg, the Recurse Center, Clay Hallock, Paul Tagliamonte, Julia Evans, Allison Kaptur, Amy Hanlon, and
Video is not yet up. Once the video recording is available, I'll probably get it transcribed and posted on the OSBridge session notes wiki page.
I've also taken this opportunity to update my talks and presentations page -- for instance, I've belatedly posted some rough facilitator's notes that I made when leading an Ada Initiative-created impostor syndrome training at AdaCamp Bangalore last year.
# 18 Jun 2015, 06:53AM: HTTP Can Do That?! and Comedy:
On Wednesday of next week (June 24th) I'm presenting "HTTP Can Do That?!" at Open Source Bridge in Portland, Oregon.
I have explored weird corners of HTTP -- malformed requests that try to trick a site admin into clicking spam links in 404 logs, an API that responds to POST but not GET, and more. In this talk I'll walk you through those (using Python, netcat, and other tools you might have lying around the house).
I practiced this talk Tuesday night at the Recurse Center and it went well; people learned a lot about headers, verbs, status codes, and odd HTTP loopholes, and gave me constructive criticism so next week's version will be clearer.
I have also suggested a Birds of a Feather evening session called "Nothing Is Totally Incomprehensible If We Try Together" but don't yet know whether or when it will happen.
Then, at AlterConf Portland on Saturday, June 27th, I'll be performing some stand-up comedy for hippie nerds. I thought about trying to cram 100 punchlines into my 45-minute HTTP talk, but I don't think I'll be able to achieve that -- people need to understand something before they can understand a joke about it -- so it'll be nice to get 4 or 5 laughs per minute during the stand-up on Saturday.
# (1) 24 Apr 2015, 03:26PM: Technothriller Book Review Partially In The Form Of A Python Exercise:
I am glad I read Hackster: The Revolution Begins..., a technothriller by Sankalp Kohli and Paritosh Yadav taking place in modern-day India. It's plotty and passionate and tense, and it's about Indians to whom India is the center of the universe. But it's also got major problems. Here are some quotes:
It was now time to attain answers. And he had found his answers in SNAGROM -- a device conceptualized by his father, but built and made operational by him with a few modifications to avenge the death of his patriotic father who had sacrificed his whole life for the progress of beloved country, India, only to be publicly humiliated and pronounced a terrorist with links to Pakistan's ISI by the ruling party of India, The Democratic Alliance Party. [p. 23]
Mr. Bedi, Vikram's father, was a scientist. He had the unique ability to solve problems by using concepts of one domain, into an altogether different one - something which most academicians couldn't do. His papers and theories on early meta-systems had brought a fresh perspective and direction into the scientific community. In his papers, he reduced the bigger problems into simple ones. He put it very simply, a meta-system is a system based on other systems. [p. 35]
Arjun could feel this guy getting to him.... he was not a person who took even the smaller defeats sportingly. For him defeat was accompanied by a splurge of vengeance. [p. 68]
"It seems like he had conceptualized a system that replicated the modern day concept of Big Data trackers and used it to come out with trends which were closer to reality." Vikram whispered to himself. [p. 78]
But, was it all because of one man? How could a single man cause so much havoc? It must have been 'the system'. [p. 111]
For ten years, he had used his peculiar ability to suppress all sorts of mutiny within the alliance with an ease that always surprised everyone around him. Nobody had ever seen him running across the country to meet the influential people in times of crisis. He would simply make a private phone call and follow up the next day. The matter would be resolved. [p. 152]
So I didn't love the prose or the characterization. And one plot thread in Hackster disproportionately bothered me.
In the scene below, two guys are investigating a break-in by Vikram, a super-elite hacker. Vikram broke into the Srinagar police department's "criminal database" to remove his friend Ashfaq's name from "the list of arms dealer with a pending investigation" (sic). Initially, police investigators had overlooked the incursion: "They termed it a routine hack failure." [p. 17-18] But this new anti-cybercrime unit digs deeper. For context, both authors of Hackster have MBAs, one "in the field of telecom technology," and in the Acknowledgement they thank someone for cybersecurity advice.
"He deleted one entry and then used a jumbler on all the others."
"After deleting the entry, he covered his track by jumbling up the names of all the people in the list. I tried running a point to point match between the shuffled copy of this list with an older correct copy, but none of the names matched. In short the whole list is corrupted, and we will not be able to make anything out of it easily. It is a long list. It has too many names. This guy is a genius." [p. 51-52]
But then Aarti, a top-shelf cybersecurity expert, succeeds at extracting the name "Ashfaq Ahmed Karim":
"He didn't know that entire data of servers of police department gets automatically stored in tape drives at the end of each month. These tape drives are detached from the servers and are stored in a secret location. I took out an older version of Illegal Arms Dealer List from the backup tape drives and then wrote a program to match each word of the older list with the newer one and rearranged the new list accordingly."
Sumit and Rao watched her with awe as she continued further, "Even the most advanced computer of ours took two days to complete this activity and give us this one name. This one lead should help us to take a step closer to our target." [p. 82]
My suspension of disbelief at this point broke so hard that it sent shards into nearby brick walls, where they remain, softly vibrating. I'm willing to set aside, for the sake of fiction, how badly guarded this data is, and why does Aarti have to go to the tape drive if there's an older version of the list more readily available, and why are they acting like this is a giant string rather than a set of rows in a table in a relational database and thus amenable to additional forensic techniques. Even so: this kind of puzzle is practically a junior programmer's intro-to-Python exercise. You could do this in bash; you could do it in Excel. And unless the Srinagar police department is tracking pending investigation against literally millions of arms dealers, a bog-standard developer's laptop could run that script in, mmm, 20 minutes.
Hmmmmmmmm, how long would it actually take? I decided to try to replicate this, without even trying very hard and while listening to a Taylor Swift album on repeat. I took the 417 names from the Nielsen Haydens' old blogroll, put them into a file separated by newlines (bloggers-archive.txt), and then removed one name, and saved the new file as bloggers.txt. Ah but now I want to obfuscate it! So I pulled all the names apart into their component words and shuffled them randomly and then wrote that back to a file (code: obfuscate.py). The new, jumbled list looks suitably forbidding:
My findmissingname.py script does not bother to "rearrange the new list accordingly" because what Aarti really wants is the missing name. findmissingname.py spits out the two words in the missing name, and it takes 0.04 seconds to do so on a ThinkPad. And I'm bone certain I could optimize performance further.
This points to an asymmetry I had not previously noticed regarding what will and will not break my suspension of disbelief. When I'm reading scifi or technothrillers, I am reasonably fine with magic zoom-enhance, encryption, robotics, and other implausible advances. I can deal with it if you have way cooler toys than exist in my world, if you tell me something hard for me is easy for you. But if you try to tell me that something easy for intermediate-skilled me is hard for hella competent world-class experts with best-of-breed gadgets, I laugh, because you're ridiculous.
I am married to a programmer whose code has literally been used to catch an illegal arms dealer. I highly doubt this repository is going to have a similar impact. But hey, I learned something new about my genre reading conventions and I practiced my Python 3.
# 19 Jan 2014, 04:51PM: Interesting-Looking Talks at PyCon 2014:
This year I'm going to visit PyCon! In fact, I'm presenting a poster: "What Hacker School Taught Me About Community Mentoring". You should register soon if you're coming, especially to take advantage of heavily subsidized childcare or to register for one of the tutorials.
Someone on one of my mailing lists asked what sessions people are particularly looking forward to. I tend to follow Skud's conference tips, which mean skipping sessions when I need to do self-care. But with such great-sounding talks, I may not be able to pull myself away!
- Allison Kaptur's "Import-ant decisions". Kaptur is a facilitator at Hacker School and I enjoy her thinking
process, areas of interest, and speaking style. I know I'll learn more about package management in general, and about Python specifically, from this talk.
- Jessica McKellar's "Building & breaking a Python Sandbox". McKellar did a residency in my Hacker School batch during which I got to see a preview of this talk, so I may not go again, but I found it thought-provoking; it helped me understand how Python works in a new way.
- Erik Rose's "Designing Poetic APIs". I met Erik Rose at a Google Summer of Code Mentor Summit and thought he was a really smart and fun guy. I've never designed an API before (thus I also want to go to the novice-focused "So You Want to Build an API?" by Megan Speir), and I like that his discussion will include anthroplogy, psychology, and history.
- Nathan Yergler's "In Depth PDB". I am one of the programmers the speaker sadly mentions; I rarely use it, if at all, even though it would be great to help me see when and why things are breaking!
- A localisation talk, maybe this one, because I think the Python application I'm working with right now just has hardcoded strings (aiee, bad).
- One of the SQLAlchemy talks, maybe this one, because I don't grok how to use SQLAlchemy yet. However I have registered for the SQLAlchemy in-depth tutorial so one may duplicate the other.
- Julia Evans's "Diving into Open Data with IPython Notebook & Pandas".
I enjoy Julia Evans's investigation process and her speaking and writing style. (See my post "Why Julia Evans's Blog Is So Great".) I have never used IPython Notebook nor matplotlib, numpy, pandas, or any of the other awesome science/data-related Python tools, and keep meaning to; this talk should help me with that.
- Greg Wilson's "Software Carpetry: Lessons Learned".
I am a tremendous fan of Wilson's work - Software Carpentry, the books he's edited, etc. The SC crowd has collected a lot of data (e.g. surveys of learners at their bootcamps) and I will probably want to soak in their lessons learned and shout about them to every other teach-y group I'm in.
- Naomi Ceder's "Farewell and Welcome Home: Python in Two Genders". I want to learn more about the experiences of trans women in my open source communities.
- Kate Heddleston, Nicole Zuckerman, presenting "Technical on-boarding, training, and mentoring". I do this task in my open source communities so I want to learn more best practices.
I'm thoroughly looking forward to my first PyCon. (I stopped by one for like an hour in 2003 and helped at the registration desk; I guess it took me eleven years to get to the other side of the desk!)
# (2) 22 Dec 2013, 10:42AM: Why Julia Evans's Blog Is So Great:
Some writing is persuasive; it aims to cause you to believe or do something. Some is expository; it aims to cause you to understand something. A lot of tech writing is persuasive or expository.
Some writing is narrative. It aims to cause you to feel or experience something. In personal narrative, the writer shares a personal experience and invites you to walk with her on that journey, experiencing it as she did, emerging with a new perspective. I really like narrative-style tech writing.
What I call the "Amazing Grace" story (previously) is, in a sense, all three of these. "Amazing grace! (how sweet the sound) / That sav'd a wretch like me! / I once was lost, but now am found, / Was blind, but now I see." Or, in more modern terms, "An English Sailor Found Salvation Through This One Weird Trick."
- Exposition: My experience started in sordid terror and ended in divine ecstasy
- Narration: Bask and wonder with me in the intricacy of my journey and the unexpected yet inevitable emergent properties of my condition
- Persuasion: Thus, if you are enthralled to sin, if you are a fallen resident of our fallen world, you should follow my example
I started thinking about this because my Hacker School colleague Julia Evans has a super-engaging blog. During our batch, she dove into operating system internals, and blogged about what she learned and how she learned it. She's consistently inspired me and made me laugh. Two of her fans (fellow HSers) even made a loving Markov-chain tribute, Ulia Ea.
One reason we love it is that most entries narrate her daily learning and illustrate a journey through confusion into wonder. See "Day 37: After 5 days, my OS doesn't crash when I press a key", which is possibly the most "Amazing Grace"-esque of her posts. Excerpt:
5. Press keys. Nothing happens. Hours pass. Realize interrupts are turned off and I need to turn them on....
It's not just the large-scale rhetorical structure; her diction and even her punctuation delight me. I particularly marvelled at her sentences in "Day 43: SOMETHING IS ERASING MY PROGRAM WHILE IT’S RUNNING (oh wait oops)". Excerpt:
12. THE OS IS STILL CRASHING WHEN I PRESS A KEY. This continues for 2 days....
As far as I can tell this is all totally normal and just how OS programming is. Or something. Hopefully by the end of the week I will get past "I can only receive one IRQ" and into "My interrupt handler is the bomb and I can totally write a keyboard driver now"....
I'm seriously amazed that operating systems exist and are available for free.
SURPRISE MY CODE IS NOT WORKING BECAUSE SOMETHING IS ERASING IT.
Can we talk about this?
- I have code
- I can compile my code
- Half of my binary gets overwritten with 0s at runtime. Why. What did I do to deserve this?
- No wonder the order I put the binary in matters.
It is a wonder that this code even runs, man. Man.
The disarmingly informal ALLCAPS adds to the intimacy more explicitly created with the question "Can we talk about this?" which invites the reader into one-on-one conversation. Moreover, I specifically call your attention to the statement "Why." and the repetition "man. Man." They demonstrate how Julia acknowledges mystery, with a tinge of disbelief.
As Patrick Nielsen Hayden observed,
A great deal of science fiction is about what the field's insiders often call "sense of wonder," a quality not entirely unrelated to the good old Romantic Sublime. Many of the genre's classics are in essence carefully-tuned machines designed to attract readers whose primary conscious loyalty is to rationalism, and lead them by a series of plausible contrivances to a sudden crescendo of mystical awe. This is an important part of SF from Olaf Stapledon to William Gibson and beyond.
And Julia Evans.
This work by Sumana Harihareswara is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Permissions beyond the scope of this license may be available by emailing the author at firstname.lastname@example.org.