Categories: sumana | Recurse Center
My experiences at and with the Recurse Center [recurse.com], formerly Hacker School
# 16 Jul 2020, 10:41AM: Misunderstanding What It Takes To Make Recurse Center's Social Rules Work:
In a contentious thread on MetaFilter about race and racism on the site, one user lauds the bit of Recurse Center's manual about how if you are corrected for breaking one of the four Social Rules, you should just accept it, apologize, reflect a bit, and move on. The user, praising the way a particular user took a correction well, says, "I want to move to a world where this kind of confrontation can be so normalized and painless." Yes! I'd like that too! I'd also like for more people and groups to make progress on following the four Social Rules themselves: No feigning surprise, no well-actuallys, no back-seat driving, and no subtle -isms.
I'm not going to say this in the MetaFilter thread because it would derail things, but I'll blog it here. I have participated twice at Recurse Center, my first batch being in late 2013, and I've participated in MetaFilter for over ten years. The significant differences between Recurse Center and MetaFilter are similar to the differences between RC and Wikimedia, which I briefly discuss in "Hospitality, Jerks, and What I Learned" (a keynote about RC that I gave a few years ago at a Wikimedia conference, back when RC was known as Hacker School). And these differences would make it a lot harder for a website like MetaFilter to take this particular aspect of the Social Rules and make it stick. Maybe even impossible.
In this post, I am trying to be descriptive about what works, and what makes good outcomes more likely, not to prescribe what individual people or institutions should do.
Key logistical differences
Every Recurser has dedicated specific time to participating in a shared cause; there are gatekeepers who only let some people in in the first place (interviewing participants to check whether they are pleasant people to collaborate with); even when we're online, most conversation is in a private space that only other Recursers can see; there are fewer than about 2000 Recursers; and (I believe) most Recursers trust RC staff to handle incidents well in case we need to escalate a report to them. And the vast majority of Recursers started after the Social Rules, in their current form or close to it, had been set as policy. And, till this year, every Recurser also had substantial experience of in-person interactions with other Recursers, including (at least since 2013, when I first participated in RC) an orientation session going into the social rules and modelling a correction and someone accepting that correction.* (RC is now remote and online, till at least the end of 2020. The orientation session is now via a videocall.) All of these elements help Recursers trust and understand each other, lower their defensiveness about being corrected, and correct each other with less worry that the other person will respond badly.
Also, pile-on/snarky responses to subtle -isms (such as racism, sexism, and homophobia) are not allowed, and Recursers are asked to only talk about distressing politics and -isms in spaces where every conversational participant has explicitly opted in to talking about that (to avoid distracting people from learning about programming). To quote the manual:
If you see a subtle -ism at the Recurse Center, you can point it out to the relevant person, either publicly or privately, or you can ask one of the faculty to say something. After this, we ask that all further discussion move off of public channels. If you are a third party, and you don't see what could be biased about the comment that was made, feel free to talk to faculty. Please don't say, "Comment X wasn't homophobic!" Similarly, please don't pile on to someone who made a mistake. The "subtle" in "subtle -isms" means that it's probably not obvious to everyone right away what was wrong with the comment.
We want the Recurse Center to be a space with as little bigotry as possible in it. Therefore, if you see sexism, racism, etc. outside of the Recurse Center, please don't bring it in. So, for example, please don't start a discussion of the latest offensive comment from Random Tech Person Y. For many people, especially those who may have spent time in unpleasant environments, these conversations can be very distracting. At the Recurse Center, we want to remove as many distractions as possible so everyone can focus on programming. There are many places in the world to discuss and debate these issues, but there are precious few where people can avoid them. We want the Recurse Center to be one of those places.
The "please don't pile on" guidance helps reduce one's worry that publicly correcting someone will cause an unpleasant pile-on, and helps a participant who's being corrected avoid the urge to get defensive. And discouraging discussions of -isms in common-area spaces that Recursers are likely to participate in by default (the "living room" and its online equivalent) means that everyone who has explicitly opted in to having such a discussion is likely to be more thoughtful and careful about participating in it. (For more on the opt-in suggestion -- and on the fact that, yeah, this means allies may learn less about fighting bigotry, because we don't talk about bias as much -- see Allison Kaptur's blog post "Subtle -isms at Hacker School".)
Of course, "accept corrections gracefully and move on" is generally good advice for an individual to try to adopt in one's own behavior!** I try to do this myself! But it really helps to get practice someplace where everyone is trying to do that, and RC helps everyone practice that skill. I was a lot better at correcting and being corrected on Day 90 at RC than I was on Day 1. And the particular systematic expectation of that behavior at RC depends on a sort of pact that everyone signs up to -- that today I'll correct you straightforwardly and nonchalantly about feigned surprise, and you'll accept that correction quickly and with good grace, and then tomorrow maybe our positions will be reversed, and you'll correct me politely about a subtle -ism and I'll take it well. That we will both be open and vulnerable to correction, and will both try not to get defensive. The structural forces I have listed earlier in this post massively help everyone trust in that pact. And there's a Recurse Center Code of Conduct introduced in mid-2017, which we know we can escalate to in case someone's behavior has been really egregious.
If you want to systematically make "give and accept corrections in this way" the norm on a message board where anyone can register for USD$5, there's no unifying purpose to why people join, everything we write there is public on the web forever, most participants never meet most of the other participants in person, there are tens of thousands of registered users who generally discuss ALL topics especially including -isms, snarky/dismissive responses to others' comments often are allowed, and the board itself is 20+ years old, that's going to be very difficult. There are significant structural barriers here.
It's also worth noting that the particular conversation that spurred this comparison is about reducing racist speech on MetaFilter. Many of the more vocal anti-racism activists on MetaFilter do not particularly trust most white users or the site's owner and mostly-white moderation staff, and are strenuously against civility norms when it comes to correcting racist statements. These are also significant differences between MeFi and RC, based on my experience in both.
"Community" and purpose
So far, in this post, I've said "message board", "website", "group", "users", "participants"; you may have noticed that I have not yet used the word "community." I am trying to be careful about how I use that word, because I think it subsumes some important assumptions.*** RC cofounder Nick Bergson-Shilcock wrote, "Having a genuine community requires that people know the other people around them, and that everyone shares some fundamental values and purpose." I agree. (I'd also say that a genuine community has to have some kind of systematic way for the membership as a whole to affect/veto decisions that will affect them, which is a place Recurse Center falls down, being a privately owned for-profit enterprise that has no advisory board or other structurally empowered voice for Recursers in RC governance. MetaFilter is on its way to starting a user advisory board specifically to listen to Black, Indigenous, and People of Color concerns.)
And my current assessment is that MetaFilter does not qualify, because most MeFites don't know most other MeFites, and because I think MetaFilter's shared fundamental purpose is ... thin in a way that bears more explaining but gets very wiggly.
The official MetaFilter guidelines say: "The fundamental goal of MetaFilter is for this to be a good, kind, generous, inclusive, and fun community on the internet." And the "About" page says:
Metafilter is a weblog ... that anyone can contribute a link or a comment to. A typical weblog is one person posting their thoughts on the unique things they find on the web. This website exists to break down the barriers between people, to extend a weblog beyond just one person, and to foster discussion among its members.
I would bet that a survey of a hundred MeFites chosen at random, about why they participate and the purpose of the site, would show several disparate clusters of answers. General entertainment and information, shooting the breeze about whatever comes up, fun (but not in particular to create fun for others), fun (to create for/with others), asking for and giving advice about specific problems and opportunities, getting publicity for stuff we've made... I'd be curious what people would say about why they participate, what they think MeFi's purpose is, and how much it would align with the goal/purpose statements above.
In contrast, RC (from the About page) "offers educational retreats for anyone who wants to get dramatically better at programming. The retreats are free, self-directed, and project based." Which is pretty clear. Everyone at RC has signed up for the purpose of "becoming a dramatically better programmer" and I would bet that, if you interviewed a hundred Recursers chosen at random, they'd all say something very similar about the reason they participated/participate and about RC's purpose.
Knowing each other, and sharing fundamental values and purpose, is important here because it's the soil from which interpersonal and group trust can sprout.
And you need trust in order to be vulnerable, including the vulnerability of speaking up about something that's wrong, and the vulnerability of accepting a correction without defensiveness.
Argh this is already so long, and I haven't even talked about how "no feigned surprise" and "no well-actuallys" and "no backseat driving" all play into the trust and vulnerability, how bad-faith actors can potentially weaponize "no feigned surprise," whether MetaFilter even wants to be a nurturing learning environment (to the extent that such a varied group can be said to "want" something), the ideological position that it is never an oppressed person's responsibility to make any effort to help create an environment that helps people learn facts or skills relevant to fighting that oppression, what general lessons I would draw from this for your Internet-based group of choice, and and and.
But: I hope you get my point. The Social Rules are great. And/but they work at RC partly in concert with the shape, type, configuration of the place and its membership. And that shape is very different from MetaFilter's.
* Recurser Nat Quayle Nelson wrote a play called "Survival Instinct" (link is to a recording of a live reading) that includes a fictional portrayal of Recurse Center. Listen to the conversation starting around 11:00 to hear a Recurser beginning to learn the "no -isms" rule.
** But I do not prescribe this as a general rule for everyone reading this, because you know what? Sometimes you should push back, though realistically this is incredibly unlikely if you're in a dominant group regarding the -ism that you got criticized about. I am not the arbiter of "the social justice rules of engagement" and that is a whole other essay that I may or may not ever write or publish.
*** I recently ran into David Gurteen's definition, "A community is a group of people who share things in common, who work together towards a common purpose which they care about and who care deeply about each other." I am not ready to buy into Gurteen's thinking, given that Gurteen believes it is not possible to have a real conversation in text, only face-to-face and maybe via telephone/videocall, and I figure that definitions of community and of conversation are pretty connected.
# 17 Apr 2019, 09:13AM: Recurse Center, What Really Works And How We Know:
I participated in Recurse Center (formerly Hacker School) in 2013 and in 2014, and emerged a better programmer, a calmer and kinder person, and a more confident learner. Gender diversity was part of the quality of that experience:
When part of the joy of a place is that gender doesn't matter, it's hard to write about that joy, because calling attention to gender is the opposite of that....
But, as Nick Bergson-Shilcock says in "What we've learned from seven years of working to make RC 50% women, trans, and non-binary", "We focus on diversity so Recursers can focus on programming.":
In April of 2012, we announced our goal to make RC 50% women. Seven years later, we are close to reaching an improved version of this goal: 48% of new Recursers in 2019 so far identify as women, trans, or non-binary. This post is a summary of what we’ve tried, learned, and accomplished over the past seven years, as well as our overall strategy and why we choose to prioritize this work.
Bergson-Shilcock's case study shares stats, what didn't work, and what they don't know yet -- the people who run RC are consistently like this, and this writeup exemplifies their judgment, integrity, and foresight. Even when I've disagreed with RC's faculty, I have always come away from the disagreement with my trust in them intact or increased. How many institutions could I describe in that way? Not many.
One last thing -- I've recently been trying to avoid saying "community" when I really mean group, set, school, industry, project, or workplace, and Bergson-Shilcock's articulation is gonna help me do that and value substantive communities:
Having a genuine community requires that people know the other people around them, and that everyone shares some fundamental values and purpose.
# 22 Sep 2016, 03:09PM: New Essay: "Toward a !!Con Aesthetic":
Over at The Recompiler, I have a new essay out: "Toward A !!Con Aesthetic". I talk about (what I consider to be) the countercultural tech conference !!Con, which focuses on "the joy, excitement, and surprise of programming". If you're interested in hospitality and inclusion in tech conferences -- not just in event management but in talks, structure, and themes -- check it out.
Christie Koehler also interviews me about this and about activist role models, my new consulting business, different learning approaches, and more in the latest Recompiler podcast.
[announcement cross-posted from Geek Feminism]
# 14 Jul 2016, 01:08PM: A Great Explanation of WebDriver and Browser Automation:
Maja Frydrychowicz's "Untangling WebDriver and the Browser Automation Landscape I Live In" is a delightful, very satisfying read. It covers the difference between the W3C WebDriver specification and Selenium WebDriver, explains their history and future, and uses the Firefox ecology as the concrete browser example so you understand how the components fit together. Also, Frydrychowicz drops in this punchline:
and some day all browsers will implement it in a perfectly compatible way and we'll all live happily ever after.
Upon reading the post, I noted:
I look into the middle distance, more motivated, yet calmer as well. I seem to hear the opening notes of "Fanfare for the Common Man" somewhere behind me. Automated browser testing seemed overwhelming previously, something to be left to Experts who knew this strange tongue. But now I know the power is in my hands; the map gleams and names that formerly confused me now fall into place. My world makes more sense; I have better comprehension of lists like PhantomJS's list of relevant test frameworks and their corresponding test runners. What might not be possible in this fresh new light?
So, if you feel faintly alienated and unmoored when trying to understand automated browser testing, check out the post.
(I know Maja Frydrychowicz because we both participated in the Recurse Center. Want to become a better programmer? Join the Recurse Center!)
# 30 Jun 2016, 11:19AM: Ambition And Failure:
People who are trying to make stuff often feel like we're failing. Ira Glass's articulation of the gap between taste and skill gets at this. He suggests making more stuff, for deadlines, for others, as a rhythm to push you to progress through that gap. But how do you keep up your morale during that push?
I'm a Recurse Center alumna, and that community often shares learning tips that are relevant to this struggle. For instance, I recommend Allison Kaptur's "Effective Learning Strategies for Programmers", which suggests reframing failure -- and reframing praise and success. Even if the tips I get via RC are programmer-centric, I can usually reuse them in other activities, such as growing my business. And earlier this year, in Ramsey Nasser's keynote The Unfortunate Value of Failure at !!Con 2016 (transcript & video), I heard a different nuance that really spoke to me. Here's the chunk of the captions/transcript that particularly resonated:
I have the same anxieties at 29 as a programmer that I did when I was a teenager. I don't feel measurably better about myself as a programmer over the last ten years, although it is objectively true that I'm a better programmer. Just looking at my GitHub repo, I can see rationally that I have actually improved, but I was trying to figure out why I didn't feel any better.
And my understanding of it... This may be different for other people, but this is my take on it. I don't think that my feeling about my skill as a programmer is actually tied to my skill at all. It's actually tied to the things that I'm trying to do, at whatever skill level I'm at. So when I was 19, I was just trying to make websites. And it was really hard. Right? And ten years later, I'm trying to write a symbolic compiler. And that's really hard. And the diff between what you're trying to do and what you're able to do is how you feel. And as I got better as a programmer, I just kept trying harder and harder things.
So the feeling is constant. Right? That's why there's no point where everything will just feel wonderful. Because I have to do this. I would have to just make basic websites for the rest of my life. And I would feel great. My anxiety would go away. I can whip up a website really, really quickly. But that's not actually what I'm excited about anymore. So my ambitions and the things that I'm excited about grow with my skill. And that's what keeps that feeling constant. That's what it's been for me. Like, right now, I'm running this whole presentation out of custom slide show software that I wrote, and I'm terrified that it's just gonna explode. Like, eat this presentation in front of everybody. And I hope it doesn't.
So if we can't eliminate it, I think we need to learn to love it. Right? We need to embrace it as part of the craft of programming. And not as this thing to be avoided. Failure... When you fail, that means you're pushing yourself. That means you're reaching beyond what you're capable of, because you want to be better. When you're failing, you're learning and you're growing. Right? You're sort of saying to yourself... Whatever you know now is great. It's wonderful. But there's more that you want. Right? It's a sense that you haven't given up on just absorbing as much as you can. When you're failing, you're exploring things that are in that grey area. That there may be interesting surprises there, or there may be things that you don't want, but you're willing... It's a sort of brave commitment to go there and to see what's out there. Failing is not wrong.
As a homeschooling parent once wrote: "The only thing that makes you smarter is doing hard things." (From the same parent: "I do think that one of the greatest educational gifts I can give her is confidence that she can seek out challenges and master them." and: "being out there on the edge of what you maybe-can't do. That's the place that you value, because that's where you stretch".)
# 21 Dec 2014, 11:10PM: Why You Have To Fix Governance To Improve Hospitality:
Fundamentally, if you want to make a community hospitable,* you need to work not just on individual rules of conduct, but on governance. This is because
- the particular people implementing rules of conduct will use their judgment in when, whether, and how to apply those rules, and
- you may need to go a few levels up and change not just who's implementing rules, but who's allowed to make rules in the first place
Wait, how does that work?
In my Wiki Conference 2014 keynote address (available in text, audio, and video), and in my PyCon 2014 poster about Hacker School, I discuss how to make your community hospitable. In those pieces I also mention how the gatekeeping (there is an initiation/selection process) and the paid labor of community managers (the facilitators) at Hacker School help prevent or mitigate bad behavior. And, of course, the Hacker School user manual is the canonical document about what is desired and prohibited at Hacker School; "Subtle -isms at Hacker School" and "Negative comments" have more ruminations on how certain kinds of negativity create a bad learning environment.
Sometimes it's the little stuff, more subtle than the booth babe/groping/assault/slur kind of stuff, that makes a community feel inhospitable to me. When I say "little stuff" I am trying to describe the small ways people marginalize each other, which I did not experience at Hacker School and thus noticed more after my sabbatical there: dominance displays, cruelty in the guise of honesty, the use of power in inhospitable ways, feeling unvalued, "jokes", clubbiness, watching my every public action for ungenerous interpretation, nitpicking, and bad faith.
You can try to make rules about how things ought to be, about what is allowed and not, but members of the incumbent/dominant group are less accustomed to monitoring their own behavior, as the Onlinesmanship wiki (for community moderators) reminds us:
Another pattern of the privileged: not keeping track of the line between acceptable and unacceptable behavior. They only know they've crossed the line when someone in authority tells them so. If this doesn't happen, their behavior stays bad or gets worse....
Do not argue about their intentions. They'll swear they meant no harm, then sulk like fury because you even suggested it. In most cases they'll be telling the truth: the possibility that they were giving offense never crossed their minds. Neither did any other scenario, because unlike real adults, they take no responsibility for getting along with others. The idea that in a cooperative work situation, getting along with one's fellow employees is part of the job, is not in their worldview.
This too is a function of privilege. They assume they won't get hit with full penalties for their first offense (or half-dozen offenses), and that other people will always take on the work of tracking their behavior, warning them when they go over the line, and explaining over and over again what they should have done and why. It's the flip side of the way people of the marked state get hit with premature negative judgements (stupid, dishonest, sneaky, hysterically oversensitive) on the basis of little or no evidence.
And, in any community, rules often get much more leniently interpreted for members of the dominant group. And this is even harder to fight against when influential people believe that no marginalization is taking place; as Abi Sutherland articulates: "The problem with being lower on an unstated social hierarchy is that marginal judgment calls will reliably go against you. It's an excusable form of reinforcement."**
Changing individual rules isn't enough. After all, individual rules get made by particular humans, who -- here, instead of babbling about social rule system theory at you, I'll give you a sort of sidebar about three successive levels of governance, courtesy of my bachelor's degree in political science:***
- Actors: The actual set of people who run an organization or who shape agendas, on any given day, have particular ideas and policies and try to get certain things done. They implement and set and change regulations. Actors turn over pretty fast.
- For example, in its five-year history, Hacker School has had employees come and go, and new participants have become influential alumni.
- Dominant worldviews: More deeply and less ephemerally, the general worldview of the group of people who have power and influence (e.g., Democrats in the executive branch of the US government, sexists in mass media, surgeons in operating rooms, deletionists on English Wikipedia) determines what's desirable and what's possible in the long term. Churn is slower on this level.
- For example, dominant worldviews among Hacker Schoolers**** include: diversity of Hacker Schoolers, on several axes, helps everyone learn more. Hiding your work, impostor syndrome, too much task-switching, and the extrinsic motivation of job-hunting are common problems that reduce the chances of Hacker Schoolers' success. Careers in the tech industry are, on balance, desirable.
- Rules of the game: What is sacred? What is so core to our identity, our values, that breaking one of these means you're not one of us? The rules of the game (e.g., how we choose leaders, what the rulers' jurisdiction is) confer legitimacy on the whole process. Breaking these rules is heresy and amending them is very hard and controversial.***** Publicly disagreeing with the rules of the game costs lots of political capital.
- For example, the rules of the game among Hacker Schoolers, as I see them, include: the founders of Hacker School and their employees have legitimate authority over admissions, hiring, and rule enforcement. Hacker School is (moneywise) free to attend. Admission is selective. A well-designed environment that helps people do the right thing automatically is better than one-on-one persuasion, which is still better than coercion.
(Where do the four Hacker School social rules fall in this framework? I don't know. Hacker School's founders encourage an experimental spirit, and I think they would rather stay fluid than accrete more and more sacred texts. But, as more and more participants have experienced a Hacker School with the four social rules as currently constituted, I bet a ton of my peers perceive the social rules as DNA at this point, inherent and permanent. I'm not far from that myself.)
(I regret that I don't have the citation to hand, and would welcome the name of the theorists who created this model.)
So, if you want a hospitable community, it's not enough to set up a code of conduct; a CoC can't substitute for culture. Assuming you're working with a pre-existing condition, you have to assess the existing power structures and see where you have leverage, so you can articulate and advocate new worldviews, and maybe even move to amend the rules of the game.
How do you start? This post has already gotten huge, so, I'll talk about that next time.
* I assume that we can't optimize every community or activity for hospitality and learning. Every collaborative effort has to balance execution and alignment; once in a while, people who have already attained mastery of skill x just need to mind-meld to get something done. But if we want to attract, retain, and grow people, we need to always consider the pathway to inclusion. And that means, when we accept behavior or norms that make it harder for people to learn, we should know that we're doing it, and ask whether that's what we want. We should check.
**See the second half of "One Way Confidence Will Look" for more on the unwillingness to see bias.
*** I am quite grateful for my political science background -- not least because I learned that socially constructed things are real too, which many computer science-focused people in my field seem to have missed, which means they can't mod or make new social constructs as easily. Requisite variety.
**** A non-comprehensive list, of course. And I don't feel equal to the more nuanced question: what beliefs do the most influential Hacker Schoolers hold, especially on topics where their worldview is substantially different from their peers'?
***** The US has a very demanding procedure for amending the Constitution. India doesn't. The US has had 27 amendments in 227 years; India, 98 in 67 years. I don't know how to interpret that.
# 15 Dec 2014, 12:06PM: A Code Review Group:
I'm interested in piloting a peer code review group, structured like a writers' group. So next month I'm starting one in New York City, beginning with Hacker School alumni and participants, and I figured I'd put some logistics and reasoning here for my own future reference and to help anyone who'd like to do something similar.
Basics: Part of the point of a writers' group is to get participants to produce work consistently, and part of the point is to help everyone learn craft -- the authors and the critiquers. So I'm trying out a similar structure for this pilot. We will meet in person; I think that criticism is often a lot easier to take in person, and I know it's easier for me to take in person. We'll meet about every 3 weeks, midday Saturday or Sunday, to critique two works of code. We'll have a rotating schedule of who's responsible for writing code and who's responsible for reviewing it; I am maintaining that schedule. (I'm copying the frequency and format from writers' groups like Leonard's.) I figure we'll run this with about 5-6 people for four months, hopefully giving each person a chance to have their code reviewed twice, and then reevaluate and see what to keep, change, or give up.
Who?: It felt natural to me to start this in the Hacker School community. Anyone who's going or gone to Hacker School is someone who accepts the social rules we've set up to make learning easier, and is generally collaborative and friendly. Also, alumni can use Hacker School for stuff like this outside of normal work hours, which means we can use a HS conference room (and projector!) for the group meetings.
What language?: Since Python is the only language I am fluent in and it is the language I'd prefer to work in and grow in, most code we review will be in Python. I consider myself an intermediate Python programmer (very comfortable writing list comprehensions, but still need to stop and look up exception-handling syntax when I need it; see "mcmasala" for a recent code sample). Fortunately, Python's enough of a lingua franca, and there's a wide enough variety of skill levels in the Hacker School community in New York, that several programmers were willing to sign up to an intermediate-and-higher Python-specific group. After the pilot period, I think the group will evaluate the idea of expanding to other languages, and see how we feel about skill heterogeneity.
New code only?: I'm not sure whether people will end up submitting already-written or fresh code for the group to critique. I personally think that it would be fine to circulate bespoke and/or already-working-on-it code. Sometimes I might be working on something that's so huge that it doesn't make sense to extract a small-enough chunk of it for peers to review, so I'd write something from scratch instead. Sometimes I'd really want my peers to look at something that I have already been noodling with. I'm curious what other code review groups have found when experimenting on this axis.
Submission length?: My current wild-ass guess is that each submission for review should be somewhere between 32 and 2048 lines of code, but, given that `this.py` (as in `import this`) is ~6 lines other than a giant string, I am happy to deal with codebases of lengths 4-31 as well, for the length of the pilot. :)
Time commitment?: As far as I can tell, here's the format and time commitment for this pilot:
- Everyone: in-person meeting every three weeks for about four months, January-April -- probably about 60-90 minutes long each time, with about 30-45 minutes for each work (the critiquers offering praise and criticism of the work, and the author responding at the end).
- Per person: Twice during the pilot: writing code and emailing it (or a link to it) to the rest of the group, a week ahead of the meeting.
- Per person: Before every meeting, so, about 5 times: reviewing the author's or authors' code ahead of time and writing out notes, so it's easier to give specific praise and criticism at the meeting, and to email to the author(s) afterwards. (I say "author's or authors'" because even if you're one of the two authors who submits code for a particular meeting, you'll still have to review the other author's code for that meeting.) Writing a critique will probably take the participant at least 30 minutes per critique.
- Organizer: a few hours total of scheduling, sending nag emails, and writing writeups like this one. :)
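The scheduling arithmetic above (two works per meeting, each person reviewed twice) can be checked with a small sketch. This is only an illustration: the function name and the participant names are placeholders, and it assumes a simple round-robin rather than whatever order the real schedule uses.

```python
from itertools import cycle, islice


def rotation(participants, works_per_meeting=2, turns_each=2):
    """Return a list of meetings; each meeting lists that session's authors."""
    # Walk around the group until everyone has had `turns_each` author slots.
    slots = list(islice(cycle(participants), len(participants) * turns_each))
    # Chop the slots into meetings of `works_per_meeting` authors apiece.
    return [
        slots[i:i + works_per_meeting]
        for i in range(0, len(slots), works_per_meeting)
    ]


# Placeholder names, not the actual participants.
schedule = rotation(["Ana", "Ben", "Cy", "Dee", "Eli"])
for number, authors in enumerate(schedule, start=1):
    print(number, authors)
```

With five people this comes out to five meetings, which matches "about 5 times" at one meeting every three weeks over roughly four months.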
I'm opening comments on this post specifically to hear from other people who have participated in code review groups, about what has worked and not worked for you. And of course other people should feel free to reuse bits of these ideas to start groups that meet online, or go multilingual, or meet more or less frequently, or what have you!
# (1) 18 Nov 2014, 04:21PM: Using Beautiful Soup, Pystache, and Lunr.js for an Archival Site:
My third week of my 2014 Hacker School batch, I decided to take on a project that I'd originally thought about doing a year before, during my first go at HS.
Between April 2005 and August 2007, I wrote a weekly column called "MC Masala" for the "Inside Bay Area" section of several papers in the San Francisco Bay Area, including the Oakland Tribune. My work circulated to about a million people, I'm told. A few years ago I grabbed a softcopy of almost all my archives off a periodicals database, and then in 2011 I made an abortive attempt to get the columns online, but gave up on all the fiddly textmunging bits.
But a few weeks ago I felt ready to make a go of it, and I figured this would be a fun and useful way to learn Beautiful Soup and learn to finagle a search engine. So I basically stopped doing the Matasano crypto challenges and started a new project.*
Beautiful Soup, Pystache, and sed
I wrote a script to take a list of HTML files of my old newspaper columns and scrape them using Beautiful Soup. (I only needed a tiny bit of live help from Leonard -- to wit, he got me to use the html5lib parser instead of the default.) My script output a Python dictionary containing the stories as structured data: headline, date, & body. And I wrote a script to render that data through Pystache templates I wrote and write an HTML file for each story, plus a table of contents page. (I don't intend on adding comments or starting the column back again, so I didn't think I'd want a CMS. Pystache, the Python implementation for lightweight Mustache templates, seemed like a reasonable choice.) I got some help on this, notably from a pairing session with Chase Lambert on testing Unicode stuff, and from a pairing session with Geoff Shannon on a Pystache type and inheritance problem.
Unfortunately I never quite figured out how to get one Pystache template nested in another, so there's some code duplication (perhaps partials are the answer). And I had to hack my way around some loopback issues so as to put chronological next/previous links on each article. (Story URLs are just kebab-cased dates. So, my script gets the headline and date (and thus the URL) of the next or previous story by traversing a date-sorted list of dates-and-headlines dicts, then renders the dates and URLs into variables in the template. Oh right, this is where a CMS would have been nice! Lightweight is great until it's not.)
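The next/previous traversal described above could be sketched roughly like this. It is not the original script: the function names, the `.html` suffix on the URL, and the sample headlines are all assumptions made for illustration.

```python
from datetime import date


def story_url(story):
    # Assumption: the URL is the ISO (kebab-cased) date plus ".html".
    return story["date"].isoformat() + ".html"


def with_neighbors(stories):
    """Return template contexts carrying optional previous/next links."""
    ordered = sorted(stories, key=lambda story: story["date"])
    contexts = []
    for i, story in enumerate(ordered):
        context = dict(story, url=story_url(story))
        if i > 0:  # the first story has no predecessor
            earlier = ordered[i - 1]
            context["previous"] = {
                "headline": earlier["headline"],
                "url": story_url(earlier),
            }
        if i < len(ordered) - 1:  # the last story has no successor
            later = ordered[i + 1]
            context["next"] = {
                "headline": later["headline"],
                "url": story_url(later),
            }
        contexts.append(context)
    return contexts


# Placeholder data, not real column headlines.
sample = [
    {"headline": "Second column", "date": date(2005, 4, 17), "body": "..."},
    {"headline": "First column", "date": date(2005, 4, 10), "body": "..."},
]
contexts = with_neighbors(sample)
```

Leaving the `previous` or `next` key out entirely (rather than setting it to an empty string) lets a Mustache section in the template skip the link on the first and last stories.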
(In the course of all this, I (with help from a sed FAQ) wrote my first real honest-to-goodness "changing a bunch of files in-place with sed" one-liner in years or possibly ever. A ton of links in several files were pointing to the parent directory instead of the current directory. So: `sed -i '/head/s/\.\.\///' *.html` means "In-place, change `../` to nil, in all the `.html` files in this directory." Whoo!)
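For comparison, here is a rough Python equivalent of that sed one-liner. It is a sketch with hypothetical names, not something from the original project. One detail worth noticing: the `/head/` address means sed only edits lines that contain "head", and without a `g` flag it only removes the first `../` on each such line; the sketch mirrors both behaviors.

```python
from pathlib import Path


def strip_parent_prefix(directory, marker="head", prefix="../"):
    """Rewrite *.html files in place; return the names of files that changed."""
    # Mirrors: sed -i '/head/s/\.\.\///' *.html
    changed = []
    for path in sorted(Path(directory).glob("*.html")):
        original = path.read_text(encoding="utf-8")
        fixed_lines = [
            # Only lines containing the marker are touched, and only the
            # first occurrence of the prefix on each of those lines.
            line.replace(prefix, "", 1) if marker in line else line
            for line in original.splitlines(keepends=True)
        ]
        updated = "".join(fixed_lines)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path.name)
    return changed
```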
The look, the feel
(There was a cotton ad on TV when I was a kid, with the jingle, "The look / the feel / the fabric of our lives." Sometimes Nandini and I sing it to each other. I suppose if there were an ad for Cascading Style Sheets on TV today it could use the same motto.)
I wrote the stylesheet and arranged the proper elements in the template with a bunch of help from Mozilla Developer Network's guidance on boxes and tables, and that old standby, CSS Zen Garden. I gratefully and curiously perused several nice-looking styles for inspiration and edification. I now more thoroughly understand the difference between margin and padding, and grok better why modern sites have a zillion
For a "home" image, I used a picture of me that Valerie Aurora took, and for a header decoration, I used the GNU Image Manipulation Program to stitch together repetitions of a photo that Kitt Hodsden took and blogged in 2012.
I've made database schema decisions before, but I haven't previously decided on search indices. It was cool that I had the power to change up the parsed output once I realized that the structured data ought to have hrefs as the unique IDs, rather than otherwise-useless unique doc IDs.
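To make the "hrefs as unique IDs" idea concrete, here is a small sketch of search documents keyed by href. The field names, the function name, and the sample story are assumptions; it only shows the shape of the data, not how the site's search index was actually built.

```python
import json


def build_search_docs(stories):
    """Serialize stories as search documents identified by their href."""
    docs = {}
    for story in stories:
        href = story["href"]
        if href in docs:
            # An ID is only useful if it is unique.
            raise ValueError("duplicate href: " + href)
        docs[href] = {
            "href": href,
            "headline": story["headline"],
            "body": story["body"],
        }
    return json.dumps(list(docs.values()), indent=2)


# Placeholder story, not a real column.
print(build_search_docs([
    {"href": "2005-04-10.html", "headline": "First column", "body": "..."},
]))
```

Because the ID is the page's own href, a search hit can be turned into a link without a second lookup table.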
MC Masala is live! I am so happy that these columns have a nice home now, and that I made it. I got to exercise my Python, which is strong, and I got to strengthen a bunch of other skills along the way. It's not perfect, and I have a TODO list, but it's the nicest-looking site I've ever made, and it fulfills its function well. And I made it in just a few days.
* I basically stalled on the Matasano challenges, and will come back to them someday when I don't feel so time-constrained. I did get some use out of doing the ones I did! I have now grokked byte-level stuff much better, and learned about bytearrays thanks to Allison Kaptur. And I got some laughs out of the process. Example: In challenge six, the Hamming distance the player calculates should be 37. First attempt: came up with 14. Next: 598. I literally laughed aloud. Then, when I finally got 37, I thrust my arms into the air with great vigor because I WAS A DEITY OF PURE LIGHT. But then I started getting depressingly wrong answers and kept getting them; I got help from friends, but decided to hold off and only look at one friend's potentially-spoilery explanation when I'm ready to come back, and I still haven't looked at it. I tried to remind myself of a sort of Allison Kaptur/Carol Dweck "the edge of maybe-can't"/"The only thing that makes you smarter is doing hard things" attitude, that I am a Joseph Campbell hero and the greater my struggle the greater my triumph will be. But I was tearing up in frustration, and I decided to give myself a rest from crypto and level up on the main skill I'd come to Hacker School to learn, namely, webdev. And I think that was the right decision. You gotta manage your own morale and momentum -- that's a resource too.
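For readers who have not met it, the Hamming distance in that challenge counts differing bits, not differing characters. A minimal sketch follows; the two test strings are the ones published in the public challenge text rather than anything quoted in this post, and the function name is a placeholder.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    # XOR leaves a 1 wherever the two bytes disagree; count those 1 bits.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


print(hamming_distance(b"this is a test", b"wokka wokka!!!"))  # 37
```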
# 18 Nov 2014, 01:01PM: A Node.js Project, And Deciding to Shelve It:
In my second week of my 2014 Hacker School batch, I asked:
What are red flags in scifi/fantasy magazines' calls for submissions? What words/phrases make you think "ew, avoid"? -- @brainwane, 3:48 PM - 13 Oct 2014
As Moss guessed, I was thinking of making an SF&F version of joblint.org, to automatically check for suspect wording in "please submit" pages and posts by speculative fiction publishers.
I take off my hat to Rowan Manning for creating the tool and the site, which I found easy to adapt (my fork of the tool, my fork of the site). The code's in Node.js, and despite an npm problem on Ubuntu, I found it fairly easy to figure out how to change the tests, regular expressions, and error messages, modify the package dependencies and update appropriately (especially thanks to Hacker School colleagues). Check it out:
But conversation with some SF&F community members led me to believe that the joblint approach wouldn't help here. In tech industry job descriptions, you can rely on certain buzzwords and key off them; joblint should be only part of a suite that catches problems, the way a code linter should be in a software engineering process, but it provokes thought and is useful on its own. But problems with SF&F calls for submissions are often in subtler approaches rather than easy-to-match strings. So it didn't feel worthwhile for me to try for a regexes-alone approach, and I didn't want to spend my Hacker School time thinking through the automated literature analysis part of this problem; that's not what I wanted to do in this batch.
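A toy version of the regexes-alone approach shows both why it works for buzzwords and why it misses subtler problems. This sketch is in Python rather than joblint's Node.js, and the rules and messages are invented for illustration; they are not joblint's actual rule set.

```python
import re

# Hypothetical rules: (compiled pattern, message shown to the user).
RULES = [
    (re.compile(r"\b(rockstar|ninja|guru)s?\b", re.IGNORECASE),
     "buzzword job title"),
    (re.compile(r"\bwork hard,? play hard\b", re.IGNORECASE),
     "culture cliche"),
]


def lint(text):
    """Return (matched text, message) pairs for every rule that fires."""
    findings = []
    for pattern, message in RULES:
        for match in pattern.finditer(text):
            findings.append((match.group(0).lower(), message))
    return findings


print(lint("We need a rockstar who will work hard, play hard."))
print(lint("We welcome stories from anyone, within reason."))  # nothing fires
```

The second call returns an empty list: a call for submissions can be off-putting in tone or implication without containing any string a pattern could key off.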
So I shelved the project and I have not gotten it even close to launch. But the code's up with a TODO list, and y'all should feel free to grab it and run with it if it strikes your fancy!
# (1) 18 Nov 2014, 10:46AM: Things I Learned About Drupal And Odd 404s:
Back on October 7th, I offered "Some Tips On Domain Names And Hosting", and said: "So, next step: choosing a provider, spinning up a server, loading it up, and pointing my new domain name at it!" And then an interesting unexpected thing came up, which takes up the majority of this post (see the "Weird spam and HTTP tricks" section).
I chose DigitalOcean mainly because a peer had a $10 referral coupon thing, so I could for free enjoy the benefits of using a service that has a business model that makes sense and won't get all ad skeevy (relevant rant, parts one, two, and three).
I faced some two-factor auth problems basically because the most convenient 2FA solutions assume you are fine with installing a closed-source app on a computing device you control.
Also, when spinning up a DigitalOcean droplet for the first time and SSHing into it, I'd like to establish the authenticity of the host by verifying the ECDSA key fingerprint. Where in one's digitalocean.com settings or in the web UI should one look to find that? The answer: one can't. I looked on the web and asked around, and found a lot of people saying, "when you get to 'the authenticity of this host cannot be established, are you sure,' just say yes." There is apparently no way to verify that key fingerprint in the web UI. The attack vector is microscopic (someone else coming in and spoofing the IP address right after you spin it up and before you have a chance to SSH in). But it still annoys me. I hear Amazon EC2 has solved this problem and does give you a way to verify the fingerprint.
I followed some useful tutorials to refresh my memory so I could set up an Ubuntu server and get a LAMP stack installed. Another helped me install Drupal. I have now successfully installed Drupal!
Generally, if you want to make Drupal do what you want it to do, it's helpful to install modules that other people have made, and maybe themes. You can check out popular modules such as Views, and you can look up how to install modules and themes, and learn how to install modules and themes specifically in Drupal 7.
Thanks to much help from Fureigh (example), when I looked up an "installation profile" ("ngpprofile") that interested me, I found out about Drush and installed it. It seems as though drush wants or seems to need to do everything as root, which doesn't feel right to me, so maybe I misunderstood. Then again, a sysadmin of my acquaintance mentioned his "you gotta be kidding me" reaction to a Drupal installation HOWTO that blithely said "now `chmod 777` the web directory", so maybe I just have a different attitude to privileging than Drupal does! Some more thoughts on Drush: a slide deck, GitHub, a homepage, and a project page.
And Fureigh submitted a patch to get ngpprofile to work properly with Drush! ... And then I ungratefully did not try to use ngpprofile, and instead looked at a very very simple theme, and then fiddled manually with templates and the admin dashboard to make my site look just slightly different from a regular stock Drupal site. Drupal theming seems to be a pretty deep skill in and of itself.
I got help from the `#drupal-support` IRC channel on Freenode as I went -- thanks! If I ever dip into Drupal again, I'll check out a video resource they recommended, including a "build your first Drupal 7 website" video sequence.
Weird spam and HTTP tricks
I bought a brand-new domain name via Hover and pointed it to my DigitalOcean droplet. The next day, I looked at various admin logs and noticed strange 404s that had nothing to do with my site. Clearly they were spam and the attackers hoped I would click on their URLs thinking they were referrers, or similar (if the attacked site's 404 logs are public, intentionally or accidentally, then this tactic would increase the spammer's pagerank). I'll reproduce one here, with the actual URL replaced with "myphishingsite.biz" and eliding the IP.
TYPE page not found
DATE Thursday, October 9, 2014 - 10:46
USER Anonymous (not verified)
HOSTNAME [IP address elided]
Hmmm. The spammer left their URL in the LOCATION field somehow, but there's no referer (Drupal spells it "referrer" in the admin console). I found that I could cause a "page not found" log entry by going to a nonexistent page on my site, e.g. `/bleeber`, but then the LOCATION for that log entry was `http://[hostname.tld]/bleeber`. How was the spammer manufacturing an entry with a LOCATION of `http://myphishingsite.biz`? And what was up with the truncated initial "h" in the MESSAGE field?
With a few pointers from two Hacker School colleagues, a bit of reading up on how Drupal logs 404s, what access logs look like in Apache, and what 404 actually means, and some trial-and-error, I began to see what was happening. If I went to http://myhostname.tld/http://panix.com , then my access logs included `GET /http://panix.com`. But the attacker sent requests that logged as `GET http://[spamsite]` (notice that there is no leading `/`). So I began to suspect that the attacker programmatically sends `GET` requests with some kind of intentionally malformed header. (And then this helped me explain why, in the report overview in the web-based admin console, the spammed URLs miss their first character (the h in http) -- usually you don't care about the leading slash or about the base URL when you're skimming that overview, so Drupal programmers made some kind of "omit the first character" choice.)
Time to break out `netcat`! Usually, the first string after `GET` in an HTTP request header is the location of the resource you want on the host that you're sending the request to (below, "myhostname.tld" is the host that I'm sending the request to). You'll often see `GET /` or `GET /favicon.ico`, for instance. But there's no reason you can't do something like this:
$ nc myhostname.tld 80
GET http://berkeley.edu HTTP/1.1
When I sent that HTTP request manually, I could replicate precisely what the spammers were doing, in terms of what characters showed up or got clipped in the relevant logs. For instance, the access log entry:
[IP address elided] - - [11/Oct/2014:16:23:47 -0400] "GET http://berkeley.edu HTTP/1.1" 404 7574 "-" "netcat"
And if I were specifically attacking Drupal administrators and wanted them to click on things, and I knew about the initial truncated character in the web-based admin console view, I might send a `GET` request that includes an initial character to throw away:
$ nc myhostname.tld 80
GET /http://nyc.gov/ HTTP/1.1
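The same hand-typed requests can be assembled in Python, which makes it easier to see that the request target is just whatever string follows `GET`. This is a sketch with hypothetical function names; the `Host` and `Connection` headers are assumptions (the post does not show which headers were typed into netcat), and `myhostname.tld` is the same placeholder used above.

```python
import socket


def build_get_request(target, host, user_agent="netcat"):
    """Build a raw HTTP/1.1 GET whose request target is used verbatim."""
    lines = [
        f"GET {target} HTTP/1.1",     # target can be "/", "/x", or a full URL
        f"Host: {host}",              # assumption: HTTP/1.1 expects a Host header
        f"User-Agent: {user_agent}",
        "Connection: close",
        "",
        "",                           # blank line ends the header block
    ]
    return "\r\n".join(lines).encode("ascii")


def send_raw(host, request, port=80, timeout=5.0):
    """Send raw bytes and return the raw response; use on a host you administer."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# Nothing is sent here; this only prints the bytes that would go on the wire.
print(build_get_request("http://berkeley.edu", "myhostname.tld").decode("ascii"))
```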
So, my first week of my second Hacker School batch, I succeeded in learning a bunch about using the domain name system, hosting, and Drupal, AND I learned how to do hilariously wrong things with HTTP requests. (The site isn't up anymore, because that wasn't the point.) I then went on to build some more sites with different tools, and I'll blog about the rest of them in upcoming posts.
# (1) 14 Nov 2014, 04:07PM: Sometimes Paths Are Useful:
I just finished a six-week batch at Hacker School. As an alumna, I had the option of asking to come back for three months or for a six-week minibatch, and I decided on the latter. I'll be writing more about my lessons, but today I can mostly point to my programming partner's writeup and add a silly story.
I met Greg Hendershott at !!Con months back, and then we ended up in the same batch and found that we laugh at each other's jokes. So we tried to figure out what to work on together. He's way into functional programming, Racket, Clojure, stuff like that, and has for instance written an emacs mode for Racket. In contrast, I'm only fluent in Python and have been concentrating on web dev. We found common ground in Python and an interest in security, and made a webservice that runs a static analyzer on a user-submitted code sample and returns to the user a "report card" of vulnerabilities in their code. That's what I spent the last two weeks on.
In his post, Greg describes how we rejected smaller and smaller web frameworks, finally settling on subclassing from `BaseHTTPServer` (built into Python's standard library). When you do that, you have to literally define methods so that the server can handle even the most basic HTTP verbs, like `POST`. We defined `POST` but didn't define `GET`, because we didn't need to! It felt so tremendously subversive, creating a web service that gave you a 501 (Not Implemented) if you tried to `GET /`, and yet actually did other things. Deliciously wrong.
(Also amazing: reading and subclassing from code whose initial code comments specifically and relevantly cite the work of Tim Berners-Lee and Roy Fielding. I felt such awe and gratitude, that I am part of a grand heritage of innovation and infrastructure. What an inheritance!)
So then a few days later we decided to make a simple web page or two, so that someone using a web browser could use the service. I loved the experience of API-first design, and felt amused when I implemented our server's second method, `do_GET`. (One nice thing about long-term collaboration is that you can pair some of the time and also do some bits on your own, bringing them to your partner for code review.)
`do_GET`, like `do_POST`, didn't care about the path, because there's only one thing a user is ever going to do with our service. No URL routing required. A `GET` request always caused the server to return index.html.
Then I stubbed out a small index.html page, borrowing bits and pieces from other past projects where I'd solved similar problems. And I thought "well I'll style this a bit" and copied a style.css file from one of my old sites into the project directory, linked to it in the `head` element of index.html, futzed with some element names and IDs, and reloaded. Hmm, why no styling? Shift-reload. Still looked bare. I opened up the developer toolbar...
...and saw that "style.css" had the text of index.html. Because I had defined `GET` to always return index.html! And when you want a browser to be able to use a stylesheet, well, it'll have to `GET` it.
I laughed pretty hard, then inlined the CSS. (And we did end up writing a bit of URL routing so we could serve a favicon to browsers and to serve a capabilities document to service clients.)
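Both surprises in this story can be reproduced with a short sketch. It uses Python 3's `http.server` (the successor to the Python 2 `BaseHTTPServer` module); the class names, the page contents, and the `fetch` helper are hypothetical, and this is not the project's actual code.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

INDEX_HTML = b"<html><head><link rel='stylesheet' href='style.css'></head></html>"


class PostOnlyHandler(BaseHTTPRequestHandler):
    """Defines do_POST only, so the base class answers other verbs with 501."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        submitted = self.rfile.read(length)
        body = b"received %d bytes" % len(submitted)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


class PathBlindHandler(PostOnlyHandler):
    """Adds a do_GET that never looks at self.path."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(INDEX_HTML)))
        self.end_headers()
        self.wfile.write(INDEX_HTML)  # same bytes for "/", "/style.css", anything


def fetch(handler_class, method, path, body=None):
    """Serve exactly one request on a throwaway localhost port."""
    server = HTTPServer(("127.0.0.1", 0), handler_class)
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    connection = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    connection.request(method, path, body=body)
    response = connection.getresponse()
    result = (response.status, response.read())
    connection.close()
    worker.join()
    server.server_close()
    return result


print(fetch(PostOnlyHandler, "GET", "/")[0])
print(fetch(PathBlindHandler, "GET", "/style.css")[1] == INDEX_HTML)
```

The first call shows the 501 for an undefined verb; the second shows why the browser's request for the stylesheet came back as the HTML page.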
I get so much joy out of playing with the building blocks of the Web. It's a great feeling. Thanks for working on this with me, Greg!
# 04 Nov 2014, 01:04PM: .illusion():
Last night one of my Hacker School peers was practicing sleight-of-hand with a card deck, and another peer walked over and said, "Oh, I used to run a magic tricks website."
I waited with bated breath for the punchline. None came! So I had to make some up.
I used to run a magic tricks website, but it disappeared.
I used to run a magic tricks website; I wrote it in Haspell.
I used to run a magic tricks website; it ran RabbitMQ.
I used to run a magic tricks website; I used SQLAlchemy. (predicated on the false memory that SQLAlchemy's logo is a tophat and cane)
I used to run a magic tricks address book application; pick a .vcard format, any .vcard format!
I used to run a magic tricks website; this is my lovely helper function.
But I felt stymied. When I think of magic tricks, I think of visuals and descriptions, not easy-to-pun jargon. And I couldn't think of any puns on the names of GOB Bluth, Penn and Teller, David Copperfield, or Criss Angel/Mindfreak.
And then Cerek Hillen came up with: "I used to run a magic tricks website; I wrote it in Brainfreak." And I thought: yes. It is done.
# 04 Nov 2014, 11:49AM: Vestiges:
I know some Russian, some French, and some Kannada, and every once in a while, my vocabulary fractures and I say a word from some other language. "Nodu" is Kannada for "look" (imperative second-person), and to this day, if I want to point something out to an interlocutor, I'll find myself saying "Nodu." (By now I think Leonard's learned that bit of Kannada through repetition and pattern-matching.)
I know some Python, some Bash, and some Scheme, and every once in a while, as I typetypetype in a Python file in emacs, I'll find myself wanting to `car` to get the first element from a list, or wanting to pipe (`|`) the output of one function into another.
# (1) 31 Oct 2014, 04:59PM: A Few Intermediate Git Tips:
Today I led an intermediate Git workshop at Hacker School, with occasional help from more experienced Git users. We covered:
- cherry-picking versus merging a commit from one branch to another
- `git blame [filename]` to see who last touched a line
- `git log --full-diff -p [filename]` to view full diffs, and a few cool things to put in your `.gitconfig` to better view your log, e.g., aliasing something to `log --oneline --graph --all --decorate -30`
- better search with `git grep`, and file listing with `git ls-files`, to only look at the files in your repository (thus ignoring files mentioned in your `.gitignore`)
- `git add -p` to make your commits cleaner and improve your pull requests (with thanks to this blog post by Allison Kaptur)
- `git rebase -i` to rewrite history in your branches and thus also improve your pull requests
- shallow cloning with `git clone --depth 1` (demonstrating that it is faster and takes less disk space, but this took a few tries, since Git is so efficient at storing past revisions that the effect barely registers for small, young repositories)
- `git reset` and the differences among default, `--soft`, and `--hard`
- ways to talk about history and what `git rev-parse` does under the hood (and thus `HEAD^2` and parents and ancestors and whatnot)
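The log alias mentioned in the list could look something like this in a `.gitconfig`. The alias name `lg` is a placeholder, not one named in the workshop.

```ini
[alias]
    lg = log --oneline --graph --all --decorate -30
```

After that, `git lg` would print the last 30 commits across all branches as a one-line-per-commit graph.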
Only afterwards did I see this super useful explanation of the Git model which articulates what's actually doing what.
As we were discussing rebase, I said I didn't yet feel smart enough to do non-interactive rebases. My peer Connor frowned at that. I sought a replacement word. Skilled? Experienced? Audacious? Confident? Maybe that last one.
I'm also going to play around with the `gitk` GUI tool, maybe with `git bisect`. And I heard a brilliant suggestion: when you're about to do something in Git that feels scary, in terms of rebasing or resetting or whatnot, clone your repo and try out your idea on the clone!
# 20 Oct 2014, 06:56AM: Hacker School Miscellanea:
Found in an email I sent a few years ago: "I'm freaking 30 now, so I have decided to be Mature, stop feeling bad that I don't learn stuff well on my own, and take classes that play to my predilection towards collaborative structure." As it turns out, I think "don't learn stuff well on my own" was an oversimplification; approximately no one truly learns on their own, after all; I needed a more synchronous community rather than a purely asynchronous one.
Found in an old blog draft that I will never turn into a proper post:
- context manager - "`with x as y`" (especially for files)
- modules that are often useful - `requests`, `os`, `sys`, `time`, `datetime`, `codecs`, `unittest`
- `git add -p`
- What it looks like to merge a pull request
- Written? Kitten!'s code uses localStorage
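As a quick illustration of the first item in that list, here is a minimal sketch of the context-manager pattern for files. The function and file names are placeholders.

```python
import os
import tempfile


def round_trip(text):
    """Write text to a scratch file and read it back using `with` blocks."""
    with tempfile.TemporaryDirectory() as workdir:  # removed on exit
        path = os.path.join(workdir, "notes.txt")
        with open(path, "w", encoding="utf-8") as outfile:
            outfile.write(text)
        # Leaving the block closed outfile, even if write() had raised.
        with open(path, encoding="utf-8") as infile:
            contents = infile.read()
    return contents, outfile.closed, infile.closed


print(round_trip("hello, context managers"))
```

The point of the pattern is the cleanup: each `with` block closes its file on the way out, so there is no `close()` call to forget.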
Laura Lindzey blogs about whether she'd do Hacker School again; her answer is that she would not, though she loved it, because "Programming is no longer the thing I struggle most with." I smiled at the very last item on her list of things she particularly wants to learn about right now, because I'm genuinely comfortable with my skills in that area and that's one reason I can take a break from it to be at Hacker School.
My batchmate Alyssa Carter has the best About page I have seen in eons.
I got stuck on the sixth of the Matasano crypto challenges last week. I'm going to take another look at it this week now that I've cried a bit, gotten a new perspective from Alex Clemmer, and spent the weekend in Rhode Island at a friend's wedding reception. Gosh those trees are pretty right now, perfectly autumnal. I'm also eyeing Natas which is more directly the type of serverside web security game that piques my interest. All this on top of the main thing I'm doing during Hacker School this go-round, webdev play.
# 07 Oct 2014, 02:00PM: Some Tips On Domain Names And Hosting:
Here are some things I recently learned or re-learned about setting up your own website.
There are a ton of domain name registrars out there and a lot of them are subsidiaries of Tucows. At least one acquaintance of mine uses NameCheap and finds it low-fuss with a reasonable web UI. I decided to try Hover since they have, in the past, sponsored the In Beta podcast. You will often expect to pay about USD$10 per year, though sometimes you get deals (".club" was $5 through Hover when I last checked).
As long as I was futzing with domains, I decided to transfer over an old domain name to Hover. In order to do that, I had to obtain the auth code, a.k.a. EPP (Extensible Provisioning Protocol) code from my old registrar (the "losing" registrar). Sometimes this should be visible in the web UI when you log into the losing registrar's site. Sometimes you'll have to phone in. And then you might get a shock, because registrars evidently think it's totally okay and normal to ask you for your account password in order to authenticate you, and to send the EPP code over plaintext email. Sadface. But at least some vendors, including Hover, offer two-factor auth! And the two-factor auth applications can live on my laptop or some other device, not necessarily my phone (which is good because I haven't yet checked whether there's a 2FA app for MeeGo but I doubt it).
Once you transfer a domain, it takes maybe 24 hours for the change to propagate; after that, the losing registrar has no residual effect on the domain or on DNS (Domain Name System) resolution.
I found Maciej Cegłowski's "The Five Stages of Hosting" helpful. Right now I'm interested in hosting a reasonably simple joke site, and in learning a bit about sysadmin and deployment, so I want to be able to SSH into a standard-ish Linux machine and set up Drupal or WordPress or similar, and I don't expect my site to need to scale. So I will go with a VPS (Virtual Private Server) provider, under the "dorm room" model in Cegłowski's framing. Stan, my Hacker School colleague who let me interview him to learn this stuff, is most familiar with Linode and Digital Ocean.
I am going to act as my own sysadmin for this site, so I'm going for "unmanaged" hosting. Most VPSes offer you "unmanaged" hosting by default, in which you can only ask the provider, e.g. Linode, for help if the problem is their fault (e.g., "hey, I don't seem to have an IP address anymore!"). "Managed" means you have access to a sysadmin but you pay, say, $100 per month (sometimes less). This person performs tasks such as incident response, fixes if the site goes down at 1am, and help switching you to a new database. The point is that it's cheaper than hiring a full-time sysadmin.
Unmanaged VPS services seem to run about USD$5-20 per month, if they're flat rates, as Digital Ocean provides. (Evidently Digital Ocean caused a bit of a price war when they entered the market, so prices are lower now.) If your VPS operates on a utility model, where you pay for the resources your site consumes, then you have to watch out for spikes that run up your bill. Some services will also offer a backup service, either for free or as a paid add-on.
Linode has a good reputation for very fast customer support; they have often responded to support tickets in under five minutes. Digital Ocean also seems pretty quick. And it's helpful to have a big community of other users who can help you figure stuff out. Linode and DigitalOcean have active IRC channels and web fora, and the Linode Library and Digital Ocean's text resources cover a lot. Amazon EC2 has a huge community of existing users.
Hosting providers also compete on security, or at least they should. Several providers offer two-factor auth. One good signal: having a bounty program, where the company welcomes and pays for vulnerability reports (example: GetClouder's beta program). After watching Matthew Garrett's "Freedom, Security, and the Cloud" talk at Open Source Bridge 2014, I understand that a published security policy also sends a strong positive signal. And I hear that Linode is on its way back up after a few black eyes in this area, and has shored up its security. (Also, some people are beginning to use Docker on production sites, partly for convenient environment management, and partly for additional security. But the Docker developers don't really promise you more security, I gather. And I don't quite get what Docker is, yet, and may look into it. It's not really a virtual machine; it's more like a super-intense and very guarded virtualenv; I'm told it's like a chroot jail but I won't understand that till next week or so.)
For various reasons, security being one of them, when you get an unmanaged VPS, you get a "bare bones" Linux box with, say, `vi` on it, but not much else. You decide what software you want on that server. And on most VPSes, there's some set of (perhaps community-written) templates, scripts, or recipes for common types of setups you might want, e.g., a simple WordPress blog. These sound a bit like Chef or Puppet to me, but usually aren't. You can activate one of those scripts to run only on the initial boot of the box; you can also write your own, and use includes to nest/point to other scripts. (Since I'm trying to learn a bit of sysadmin, I'll look at those templates, but install the software more manually.) I am not quite clear yet on whether I choose those via the web UI or something more esoteric; maybe it varies per provider.
For some actions you'll need to use the web UI. For instance, once I own my domain name and I have a VPS account and a server set up, I'll need to tell my registrar that my domain's nameservers should point to the hosting provider's nameservers, e.g., ns1.linode.com. And then I'll need to log into the VPS's website and tell them what the IP address of my server is -- evidently there are "zones" and whatnot, but I haven't gotten that far. Stan confessed that he likes Linode's and Digital Ocean's web UIs a lot better than Amazon EC2's.
Speaking of Amazon: I today finally straightened out my understanding of the Amazon hosting services taxonomy!
- Amazon Web Services (AWS): an umbrella term for everything.
- S3 (Simple Storage Service): just for serving static files.
- EC2 (Elastic Compute Cloud): the thing most people are talking about when they mention AWS. It's "elastic" in that you can use software to tell Amazon to bring some more resources online to serve your needs, and you don't need to physically haul plastic and silicon around, but you do need to explicitly manage that elasticity as needs change, as is the case for about all VPSes.
And now I understand more about "elasticity". Heroku et alia (the "Monasteries" as Cegłowski calls them) provide more insta-elasticity, as the provider senses your growing or waning needs and accords you commensurate resources. Many monasteries offer a free tier, but costs can grow rapidly (cost evidently played a part in the RapGenius/Heroku tiff).
(If you just want to run a reasonably simple WordPress/Drupal/similar web app on your site and don't need or want to SSH in, there exist hosts like Dreamhost; one Dreamhost plan offers you FTP plus a web UI. For another variation, you could do what my friend Skud does, and use Dreamhost VPS to get SSH and, say, cron, but not root or sudo. That's a decent compromise for Skud; they can use it for their personal stuff (mostly WordPress and MediaWiki), set cronjobs for backups, write scripts, and generally poke around in the file system, but they can't install stuff or configure major services, since one must set up new user accounts, mailing lists, or web hosts via a web UI config panel.)
So, next step: choosing a provider, spinning up a server, loading it up, and pointing my new domain name at it!
Thanks to Stan Schwertly, a fellow Hacker Schooler, for talking me through a bunch of the hosting stuff! All errors and oversimplifications are my own.
# (3) 13 May 2014, 03:24PM: Dipping My Toes Into PHP:
This week, alumni like me get to spend time at Hacker School. Since I work on MediaWiki-related documentation and I've never programmed in PHP before, I decided to start understanding just enough PHP to be able to read it better. Jordan Orelli from Etsy, a fellow alumnus, was kind enough to give me several pointers, and to especially help me understand how a PHP programmer's experience differs from my experience as a Python programmer.
I have learned, for instance:
Much thanks, Jordan! This is all oversimplified for clarity, etc., etc. I think next up I am going to try to understand a bit of PHP syntax, and the role of PEAR.
# (1) 20 Mar 2014, 11:41PM: Why I'm Excited About !!Con:
Some get-togethers turn into dominance displays -- participants see each other as someone to defeat. We often see this pattern in technical spaces, such as conferences, mailing lists, programming classes, and code review. Skud's 2009 piece "The community spectrum: caring to combative" mentions a few groups who created caring technical subcommunities in response to a competitive or combative culture. Since 2009 we've seen more such efforts -- more and more tidepools where I feel welcome, where I gather strength between trips into the ocean.
Hacker School recognizes that dominance displays discourage learning. For years, Hacker Schoolers have worked to "remove the ego and fear of embarrassment that so frequently get in the way of education", to replace constant self-consciousness with a spirit of play. (Apply now for summer or fall!) During my batch, my peers and I balanced plain old webdev/mobile/etc. projects with obscure languages, magnificently silly jokey toys, and pure beauty. We made fun in our work instead of making fun of each other.
No one "wins" Hacker School. There is no leaderboard. Whenever possible, Hacker School culture assumes abundance rather than scarcity; attempts to rank projects or people would defile our ecology.
And now we have a conference, !!Con, with that same philosophy. It's by Hacker Schoolers but open to anyone* and encouraging talks by everyone.
I love that the !!Con organizers are designing this conference to inclusively celebrate what excites us about programming. If we learn and enjoy ourselves by writing implausible or derivative or useless or gaudy code, and by sharing it with others, the proper response is to celebrate. By focusing on sharing our personal experiences of joy, we let go of dominance-style objective ranking (which is impossible anyway), and instead celebrate a diverse subjectivity. The organizers' choices (including a thorough code of conduct, a welcoming call for proposals, and anonymous submission review) reinforce this.
I think about this stuff as a geek with many fandoms: programming, scifi, tax history, feminism, open source, comedy, and more. In the best fannish traditions, we see the Other as someone whose fandom we don't know yet but may soon join. We would rather encourage vulnerability, enthusiasm and play than disrespect anyone; we take very seriously the sin of harshing someone else's squee.
This is the fun we make. Not booth babes, not out-nitpicking each other, but wonder.
So, I'm submitting talks to !!Con, and I'm going to be there, May 17-18, soaking in this new warm mossy tidepool of love that's appeared right here in New York City. Join me?
* !!Con will be free to attend, but space will, sadly, be limited, as will the number of talks.
# (2) 22 Dec 2013, 10:42AM: Why Julia Evans's Blog Is So Great:
Some writing is persuasive; it aims to cause you to believe or do something. Some is expository; it aims to cause you to understand something. A lot of tech writing is persuasive or expository.
Some writing is narrative. It aims to cause you to feel or experience something. In personal narrative, the writer shares a personal experience and invites you to walk with her on that journey, experiencing it as she did, emerging with a new perspective. I really like narrative-style tech writing.
What I call the "Amazing Grace" story (previously) is, in a sense, all three of these. "Amazing grace! (how sweet the sound) / That sav'd a wretch like me! / I once was lost, but now am found, / Was blind, but now I see." Or, in more modern terms, "An English Sailor Found Salvation Through This One Weird Trick."
- Exposition: My experience started in sordid terror and ended in divine ecstasy
- Narration: Bask and wonder with me in the intricacy of my journey and the unexpected yet inevitable emergent properties of my condition
- Persuasion: Thus, if you are enthralled to sin, if you are a fallen resident of our fallen world, you should follow my example
I started thinking about this because my Hacker School colleague Julia Evans has a super-engaging blog. During our batch, she dove into operating system internals, and blogged about what she learned and how she learned it. She's consistently inspired me and made me laugh. Two of her fans (fellow HSers) even made a loving Markov-chain tribute, Ulia Ea.
One reason we love it is that most entries narrate her daily learning and illustrate a journey through confusion into wonder. See "Day 37: After 5 days, my OS doesn't crash when I press a key", which is possibly the most "Amazing Grace"-esque of her posts. Excerpt:
5. Press keys. Nothing happens. Hours pass. Realize interrupts are turned off and I need to turn them on....
It's not just the large-scale rhetorical structure; her diction and even her punctuation delight me. I particularly marvelled at her sentences in "Day 43: SOMETHING IS ERASING MY PROGRAM WHILE IT’S RUNNING (oh wait oops)". Excerpt:
12. THE OS IS STILL CRASHING WHEN I PRESS A KEY. This continues for 2 days....
As far as I can tell this is all totally normal and just how OS programming is. Or something. Hopefully by the end of the week I will get past "I can only receive one IRQ" and into "My interrupt handler is the bomb and I can totally write a keyboard driver now"....
I'm seriously amazed that operating systems exist and are available for free.
SURPRISE MY CODE IS NOT WORKING BECAUSE SOMETHING IS ERASING IT.
Can we talk about this?
- I have code
- I can compile my code
- Half of my binary gets overwritten with 0s at runtime. Why. What did I do to deserve this?
- No wonder the order I put the binary in matters.
It is a wonder that this code even runs, man. Man.
The disarmingly informal ALLCAPS adds to the intimacy more explicitly created with the question "Can we talk about this?" which invites the reader into one-on-one conversation. Moreover, I specifically call your attention to the statement "Why." and the repetition "man. Man." They demonstrate how Julia acknowledges mystery, with a tinge of disbelief.
As Patrick Nielsen Hayden observed,
A great deal of science fiction is about what the field's insiders often call "sense of wonder," a quality not entirely unrelated to the good old Romantic Sublime. Many of the genre's classics are in essence carefully-tuned machines designed to attract readers whose primary conscious loyalty is to rationalism, and lead them by a series of plausible contrivances to a sudden crescendo of mystical awe. This is an important part of SF from Olaf Stapledon to William Gibson and beyond.
And Julia Evans.
"I have now discovered that element.innerText works in Chrome and in Epiphany but not in Firefox."
"This is why you use jQuery."
Some more things I learned:
- Oh right, ordering matters. My
- I was wary of the whole event-handling paradigm but now I'm getting used to it and might like it. Instead of the default idea being "here I am, a script, doin' everything by myself, maybe shoehorn in some interactivity with the user sometimes", the default idea feels like "I'm a set of useful reactions to possible things the user will do".
- I know the windowshade-style show jQuery functionality is a pretty clear "look! jQuery demo!" signal. And now I know why: because it is cool and easy and just works! Yay
- To get the value of a text
$( "#InputIDName" ).val();
To stick a string into a
I am pretty sure the ".html" method escapes things to keep you from opening up an XSS vuln but I'm not sure and need to check. Argh escaping!
At Hacker School I followed my own advice and found or made up silly and boring and helpful projects to use while learning. My current rhythm seems to be: start by working through the first few chapters of a textbook to learn basic concepts and syntax, then think up a silly project to make and start making it, then run into problems one at a time, causing me to learn idioms and libraries and gotchas from a mix of my colleagues and the Internet. Maybe someday I will come back to chapter three of the book and engage in some more spiral learning! It's nice to have a diversified portfolio.
# 28 Dec 2013, 05:11PM: console.captain'slog:
Thus, I have now watched "WAT" and enjoyed it. And I have made software put "Thank you, and may God bless NoSQL." on a webpage. So that's a good harvest.
- I understand @horse_js better now!
- I sort of understand the differences among Node, npm, ClojureScript, CoffeeScript, and random noun.js files.
- I paired with Tom, one of the facilitators, to port my Obama speech generator to JS. I can successfully write functions (including higher-order functions), randomly choose things using Math.random, make and use arrays and Objects, display things in a webpage, write for loops, get input from the user using prompt(), and put semicolons everywhere.
- The JSFiddle tutorial didn't mention a necessary step so I added it.
for loop keeps happening until the step when the run condition is no longer satisfied, so it's sort of more like a while loop? Anyway this tripped me up and gave me an off-by-one error until I grokked it.
show() that give me stark errors in Node.
> 40+["yay", 23]
# 12 Dec 2013, 10:29PM: Hacker School Gets an A on the Bechdel Test:
When part of the joy of a place is that gender doesn't matter, it's hard to write about that joy, because calling attention to gender is the opposite of that. I want to illustrate this facet of my Hacker School experience: mostly, Hacker Schoolers of all genders talk about mostly the same things. And we talk about them in all gender combinations -- including, just by chance, among women.
The "Bechdel Test" asks whether a work of fiction includes at least two women with names who talk to each other about something other than a man. Thus in my blog I have an occasional series listing topics I've discussed with other women. My life passes the Bechdel Test! ;-)
So here is a list of some things I've discussed with Hacker School women. (About half the facilitators, cofounders, participants, and residents are women.)
Some Things Hacker School Women Talk About
- why LVars and set operations relate to current work in distributed systems
- The Kids Are All Right
- IRC etiquette, and when to use IRC instead of a mailing list, videocall or wiki
- the Haiku operating system's key features (many of them similar to BeOS)
- refactoring a function a guy wrote so it doesn't do everything in main() (technically breaks Bechdel?)
- whether to work at a nonprofit or for-profit
- where is that maple syrup smell coming from? (answer: someone was making oatmeal)
- our GitHub report cards
- how to use machine learning techniques to train a Markov chain to generate funnier sentences
- how the hell Makefiles work
- what the hell a cuticle is
- binary search and Huffman coding
- saving time with useful Python standard library modules (string, time, os, etc.) and packages, e.g., requests
- Too Much Light Makes the Baby Go Blind
pip gets its info (PyPI)
- the Pythonic convention for reading from a file, with open('file','r') as f, and the fact that it's a context manager
- when and how to use list comprehensions and dictionary comprehensions, generators and decorators,
- why we use pass for stub functions or classes instead of
- birth control amortization
- how you would override Python's default behavior to raise an exception when slicing a list with a negative int
- how to write a hill-climbing algorithm and why
- G.K. Chesterton's use of the mystery genre
- what the #! (hashbang) line at the beginning of a script actually does
- song currently stuck in one's head ("Gettin' Jiggy Wid' It") and confusing "Wild Wild West" with "Back To The Future III"
- what it takes to work remotely
- security issues inherent in creating a sandboxed version of an interactive Python interpreter
- who put this post-it note on the fridge saying "No Java on Monday"? When? Did the author mean the beverage or the language? Was it descriptive or imperative? Why did they never take it down?
- an awesome 1982 Bell Labs video about UNIX featuring Lorinda Cherry
I could make this list probably ten times longer. My point is, if you don't care about gender, Hacker School is awesome. If you're irritated by the tech industry's usual gender crap, Hacker School is blissfully free of it and you can -- if you want -- turn into someone who doesn't care about gender for three months.
You can apply now for the next batch -- apply by Saturday night, December 14th.
cross-posted to Geek Feminism with a cheesy sketch
# 05 Dec 2013, 08:51AM: Fisher-Price's My First Twitter Bot:
On Sunday I wrote my first Twitter bot, with a bit of help from Leonard. (A Hacker School colleague inferred, understandably, that Leonard and I just write Twitter bots on the weekend, to relax.) Then yesterday I helped a peer get her first Twitter bot going, and decided to write it up for y'all/future Pythonistas.
already written other guides to writing good Twitter bots. Those more experienced people are better than I am at writing good, surprising bots, ones that respond to tiresome hashtags or argue with followers or what have you. My guide, in contrast, assumes that you already have some kind of code that spits out strings of 140 characters or fewer, and just want help interfacing with the Twitter API using Python so a Twitter account can say those tweets.
I am posting this guide on 5 December 2013, and undoubtedly Twitter will revamp all this within the next fortnight and switch to OAuth 2038 (OAuth For Workgroups) and turn their developer site into something you manipulate by holding up your mobile phone in front of your Google Glasses. But for right now this works.
- I rely on the python-twitter wrapper around the Twitter API. If you are using a virtual environment that you made with mkvirtualenv or similar, then activate your venv and
pip install python-twitter
If you live on the edge, or if you are a devil-may-care type who loves convenience so much more than safety that you tab-complete your passwords, then command-tab away from Snapchat* long enough to install the package globally:
sudo pip install python-twitter
- Now, go to Twitter.com and create a Twitter account for your bot, e.g., @FreedomBot. Make sure that you give Twitter an email address whose messages you can actually read. Read the end user license agreement and agree to it. Did you know that there's a hotel in San Francisco named the Eula?
- Look for the automatic email from Twitter and follow the instructions to confirm your account. Once you've done that, go to dev.twitter.com and sign in with the username and password of your bot's Twitter account, e.g., your username is FreedomBot.
- Go to https://dev.twitter.com/apps and choose to create a new app. Yes, your "Markov chain parody of the Office of Management and Budget" script is an "app" for our purposes; this is the easiest way to get the relevant keys.
- Go ahead and fill in a reasonable name, description, and link (your website or Gitorious repo would do). These will not be publicly visible or scrutinized by Twitter's gatekeeper gnomes, so don't sweat it. Leave the "callback URL" field empty. Read the Rules of the Road (so much more folksy than "Guidelines for the Walled Garden") and agree.
- Now that you have an "app", go to the Settings tab and change your app type from Read-only to Read/Write, so you can actually post to Twitter from your script. (If you're having trouble navigating back to your app, go to dev.twitter.com/apps to find it.)
- Next, on your app's Details tab, click the button at the bottom to create your API secret key & token. It'll take a minute for Twitter to create those for you; after a minute, go ahead and refresh the page. Now that page gives you the "consumer key" and "consumer secret" as well as the API "access token key" and "access token secret".
(The "consumer key/secret" pair identifies your "app" (the code you are writing) as something that is allowed to interact with the Twitter API; it's sort of a substitute for the User-Agent string in a browser. The "access token key/secret" pair authorizes the Twitter account whose tweets the "app" is gonna write. So conceivably you could write an app like Sycorax that lets your bot roleplay multiple accounts, and it would end up getting multiple access token key/secret pairs so that could work. This is all OAuth stuff that briefly turns you into an octopus when you understand it.)
- So, how can your script use these secret strings (to authenticate and authorize you) without those supersensitive nuclear launch codes falling into the wrong hands, viz., your GitHub repo? I did it this way:
Open up a new file in the same directory as your bot script, called something like "twitterapi.py". Also add "twitterapi.py" to your repository's .gitignore.
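#!/wherever/you/put/python # ok, probably /usr/bin/python
import twitter
api = twitter.Api(consumer_key="KEY", consumer_secret="SECRET",
                  access_token_key="TOKEN_KEY", access_token_secret="TOKEN_SECRET")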
So now, in your application or in the Python interpreter, you can do:
from twitterapi import api
- And then, to post your awesome string to Twitter as @FreedomBot:
api.PostUpdate("wow\n so Program Assessment Rating Tool\nnice\n very budgetary")
- Optional step! If you want to leave your bot running to tweet at 5-minute intervals, then import time and stick a time.sleep(5*60) line and an api.PostUpdate(awesomestring) into a while True loop.
* Is there a Snapchat desktop app? Maybe Snapchat is available for the iPad and in my hypothetical you're programming on your tablet? I don't know what people do these days.
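Here is a rough sketch of that optional tweet-every-five-minutes loop. It is only an illustration: `run_bot`, `make_tweet`, `max_posts`, and the swappable `sleep` are names invented for this example so the loop can be tried without tweeting or waiting. In real use you would pass `api.PostUpdate` as `post`.

```python
import time


def run_bot(post, make_tweet, interval_seconds=5 * 60, max_posts=None, sleep=time.sleep):
    """Post make_tweet() via post(), pausing interval_seconds after each post.

    post, make_tweet, max_posts, and sleep are hypothetical names for this
    sketch; with python-twitter, post would be api.PostUpdate.
    With max_posts=None this is the "while True" loop and never returns.
    """
    posted = 0
    while max_posts is None or posted < max_posts:
        post(make_tweet())
        posted += 1
        sleep(interval_seconds)
    return posted
```

For a dry run, something like `run_bot(print, lambda: "wow so budgetary", max_posts=2, sleep=lambda seconds: None)` prints two fake tweets instead of posting them.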
# 23 Nov 2013, 01:59PM: I Cannot Be The First Person To Quip About Quantified Self-Loathing:
After the first week I spent at Hacker School, I worried that I wasn't spending enough time on improving my programming skills. So I started using Project Hamster to track chunks of time that I specifically spent either learning (via coding, pairing, or listening to useful lectures, mostly), versus chunks I spent teaching or helping others.
This past week, I looked at my involvement with Bicho, an open source project that helps people analyze data from bug trackers, and decided there were too many blockers for me to keep on going as I was going. Thriving is a function of a person times their environment, as I learned in my tech management courses, and -- as I wrote in a summary on the metrics-grimoire mailing list -- at my current level of programming proficiency, and given how much refactoring and testing Bicho could use, it's just a bad fit right now. The maintainers responded well, and promise a refactored Bicho is coming, so I hope to restart contributing at some point in the future.
I wondered, after I stopped: how much time had I spent on this project, and what had I learned from it? So I crunched the numbers. Between October 7th and today, I've spent 158 hours on learning activities and 12.9 on teaching/helping activities, which gives me 173.2 hours in total. (I was sometimes rough when inputting my time into Hamster, so take my significant digits with a grain of salt.) Of those, I've spent 55.9 on Bicho, 53.9 on learning and about two on teaching/helping (such as filing bugs and writing that super long email). So that's a little under a third of my Hacker School learning time.
What did I learn? I threw together a rough list:
- read a big giant codebase and understood how parts work together
- got lots more experience with git
- learned how argparse works enough to port something to it from optparse
- used XML-RPC APIs, xmlrpclib
- used launchpadlib module
- learned how packaging works (setup.py, pip)
- grokked __init__ and modules vs packages
- learned some of PEP 8 and used the pep8 and pep8ify modules
- used assert & wrote tests
- used Beautiful Soup
- used some of vars & getattr
- used pdb a little bit
- learned a little bit re Storm and ORMs
- used MySQL & python-MySQLdb
- learned to watch out for old- versus new-style classes
- got comfortable with virtual environments
- wrote a regex
- learned not to put an unquoted argument on the commandline (ampersands!)
That first one is huge. I think it may just take a super long time the first time you try to wrap your head around a codebase fifteen thousand lines long. Then again, now that I've had this experience, I've ordered Michael Feathers's Working Effectively with Legacy Code and may start following Jessica McKellar's advice to Maria Pacana: [Don't] try to understand the whole thing. Understand only as much as you need to know to make the contribution you want to make.
I've now moved on to a different project where I'm making clearer progress, though sometimes it's a slog. In retrospect, I don't really know whether my Bicho work was a good investment of my Hacker School time, or whether I should have stopped a few weeks earlier and learned more and different things. I am trying to remember not to fall prey to the fallacious Fear Of Missing Something. Maybe part of what I learned is a better intuition for "it's time to try a different approach." Argh. So hard, maybe impossible, to assess whether I made good decisions!
# 23 Nov 2013, 11:40AM: How Comprehensive Are Your Unit Tests? Coverage.py Knows:
I've been writing and maintaining unit tests for my project. But only on Thursday did a colleague's presentation remind me that I could run a code coverage tool to check which code paths my tests are or aren't exercising.
I found it super easy to install and run coverage.py, and it only took marginally more fuss to
--omit="~/.virtualenvs/*". The detailed feedback helped me increase my coverage from 70% to 82%; yay! Thanks, Ned Batchelder & other coverage.py contributors.
# 16 Nov 2013, 10:58PM: A Little Design Thinking Can Go A Long Way:
I was playing with stdin/argv because Leonard suggested I improve Missing from Wikipedia to make it more Unixy and interoperable with other scripts and systems present and future. Right now it demands that you tell it the name of an existing plaintext file as a positional argument. Why shouldn't you be able to generate a giant string of names separated by newlines and just pipe it into the script, as you would into grep and similar tools?
I struggled with this whole stdin business, trying to make the tool work with both types of data input, and became disheartened. Then I stepped back to think about what I actually want to do. Aha: I am facing a design decision. I could make different choices that would suit different audiences.
For context: I took a rhetoric class in 1998 and learned the classic Rhetorical Triangle governing any communication. I then misremembered it for more than a decade till I looked it up just now. But I like my version better. So! Sumana's Rhetorical Triangle, as applicable to a piece of political software as it is to an essay, says that if you are trying to communicate with someone, it helps to consider:
My message: some topics have way less coverage on the Wikipedias than they deserve. I feel fine sticking with that. But who are my audiences, and thus which medium should I choose?
If I want terminal-savvy researchers and developers to use this tool, then it's fine as a standalone command-line script. I should stick a setup.py in there and put it up on PyPI, and switch to an all-stdin model of data input.
If I want activists and less programming-savvy researchers to use it -- people not like me -- then the path gets foggier. I haven't tested this script on a Mac or on Windows; I could work to make sure it's friendly on those OSes, and stay with the simple "gimme a textfile" data workflow. (Why make my user learn to use pipes and
But the much user-friendlier step would be to turn it into a little web app on Tool Labs. My tool would read input from a bunch of formfields and/or allow the user to upload a CSV-type file, and could output to a nice-looking HTML page with redlinks (to help you create the pages) with options for plaintext or wiki markup download. This would also make the tool a lot more discoverable by casual websurfers. And if I put it on Tool Labs, I can run queries directly against live replicas of the Wikimedia databases, which would be faster than hitting a web API.
I imagine some folks, who like great UI and more seamless data transfer, would prefer installable desktop/mobile applications with actual GUIs. But I have approximately no skills in that area and feel very little urgency about growing said skills, so I won't be going in that direction.
Once I framed my data flow problem more as a product management question and less as an implementation struggle, I found it much easier to decide. I can serve the audience that needs this tool -- activists and researchers -- while still retaining value for those with more comfort on the command line. It would be feasible to refactor the tool into:
- a core module that takes a bunch of names, checks them against a Wikipedia, and spits out a "missing" list (you could run this as a standalone command-line script, getting data from stdin)
- a set of web-specific functions that make it easier to get input and excrete output
And I've not yet implemented a web app that takes input from a user and spits out a relevant response, so I could do that and become a cleverer programmer, or borrow code that does most of what I want.
The simplification that makes me sigh in relief: I won't write and maintain two kinda-clashing methods of data input. (Although the tradeoff is a bunch of (arguably) feature creep.)
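The core-module half of that split might look roughly like the sketch below. This is not the actual Missing from Wikipedia code: `read_names`, `find_missing`, and the `page_exists` callable are made-up names, and the "lookup" here is a placeholder set rather than a real query against a Wikipedia.

```python
import io


def read_names(stream):
    """Return the non-blank lines of a text stream, stripped of whitespace."""
    return [line.strip() for line in stream if line.strip()]


def find_missing(names, page_exists):
    """Return the names for which page_exists(name) is false.

    page_exists is a hypothetical stand-in for the real Wikipedia lookup.
    """
    return [name for name in names if not page_exists(name)]


# Demo with an in-memory stream; a command-line wrapper would pass sys.stdin.
known_pages = {"Nadine Gordimer"}  # placeholder data, not a real API result
demo_names = read_names(io.StringIO("Nadine Gordimer\n\nSomeone Else\n"))
demo_missing = find_missing(demo_names, known_pages.__contains__)
```

Because `read_names` accepts any text stream, the same function could serve piped stdin on the command line and an uploaded file in a web app, which is the point of having only one method of data input.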
# 16 Nov 2013, 09:45PM: Accidental Quine:
On Friday, while trying to work with standard input (stdin) and command-line arguments (argv), I accidentally wrote an almost-quine (a program that produces its own source code as output). I've removed a few debugging print lines, unused functions, etc. to give you this cleaned-up version:
$ ./script.py testfile.txt
b = sys.argv
if len(b) > 0:
    with open(b[0], 'r') as f:
        filedata = f.read()
if __name__ == '__main__':
Explanation: I meant to have script.py grab the first argument to script.py, assume it was a file, and open and print it. However, I failed to actually check the behavior of sys.argv ahead of time; turns out that the actual first item in sys.argv is, in this case, "script.py", not "testfile.txt". You can try this out yourself, and verify that you'll get the same output whether or not you include testfile.txt as an argument. Off-by-one error. I should have had the with open(b[0], 'r') bit try to open(b[1], 'r') instead.
Reading a file is cheating in real quine competitions. But I still found this pretty funny.
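For comparison, here is a small sketch of the non-quine behavior, written for Python 3. The function names `first_argument` and `read_named_file` are invented for this illustration; the only fact it leans on is that `sys.argv[0]` is the script's own name, so the first real argument lives at index 1.

```python
import sys


def first_argument(argv):
    """Return the first real command-line argument, or None if there is none.

    argv[0] is the script's own name, so the first argument is argv[1].
    """
    return argv[1] if len(argv) > 1 else None


def read_named_file(argv):
    """Read the file named by the first argument; return None without one."""
    filename = first_argument(argv)
    if filename is None:
        return None
    with open(filename, "r") as f:
        return f.read()
```

A script would call `read_named_file(sys.argv)` and print the result. Checking `len(argv) > 1` rather than `len(argv) > 0` is what keeps the script from opening itself.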
# 15 Nov 2013, 03:44PM: Code4Lib, Open Data, Open Access, and Fighting Systemic Bias:
"Missing from Wikipedia" (code) makes me happy. I presented about it yesterday at Hacker School, asked a fellow HSer to discuss his critique of my code, and - live! on stage! - merged his pull request. Yay for code review and collaboration! (I also showed off a much sillier toy I made, which grabs some sentence from an English Wikipedia page if you give it a topic. Sample for "Chairs": "Some are decorative.")
I am grateful and proud that I can, with "Missing from Wikipedia," make a small contribution to the ecology of openly licensed code and content that I draw from. I could make "Missing from Wikipedia" because:
- the data for all Wikimedia projects is available under an open content license
- and queryable via an open-to-all API
- that lets you get information about 50 pages at a time (and with not-too-terrible rate limiting)
- that I could access using a good open source library with great docs
- available for an excellent and well-documented open source programming language
- that already Just Works with my source control system, text editor, operating system, and laptop
And so on. I fork from the repos of giants.
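The 50-pages-at-a-time limit in the list above means a long list of names has to be split into batches before querying. A minimal sketch of that batching, with no network calls: `batches` and `titles_param` are hypothetical helper names, and joining titles with a pipe character reflects how the MediaWiki API accepts multiple values, not necessarily how the library I used does it internally.

```python
def batches(titles, size=50):
    """Yield successive lists of at most `size` titles."""
    for start in range(0, len(titles), size):
        yield titles[start:start + size]


def titles_param(batch):
    """Join one batch into a pipe-separated value for a `titles` parameter."""
    return "|".join(batch)
```

With about 2100 names, this yields 42 batches, so the whole list needs only a few dozen requests.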
But we can only use a tool like "Missing from Wikipedia" if we have data to feed into it: a list of names. This is another way open data and open access to research are important. If we can get digital copies of things like the tables of contents of other encyclopedias and dictionaries, that makes it easier for us to systematically check for missing coverage on Wikipedia. But if those lists and tables are behind paywalls, then we can't see them.
And we need access to research papers, to help us figure out what tools to write. Let's say you'd like to fight systemic bias on Wikipedia and you want to write the most effective tool you can. What proportion of these citations on the effect of sexist language can you read & assess yourself? What proportion of the research that would help you do your job better is behind a paywall, and therefore not just hard to find, but essentially undiscoverable? Papers you can't link to are like missing Wikipedia articles -- out of sight, out of mind, out of the group discourse.
At this point I wave my hands excitedly and go off in some direction expounding on the intersection of open stuff (especially Wikimedia), social justice, comedy, and transformation. I presume I will cover similar topics in March 2014 when I keynote the Code4Lib conference, speaking to people who make things for/with cultural institutions. (Such an honor to be asked to keynote Code4Lib! And with Val Aurora of The Ada Initiative giving the other keynote!)
I've benefited so much from the ecology of open stuff. I aim to reciprocate, and to help make it even better.
# 13 Nov 2013, 08:38AM: Missing From Wikipedia: Tool to Help Fight Systemic Bias:
This week I wrote a tool I currently call "missing from Wikipedia" although the name may change. You feed it a list of people's names and the language Wikipedia you want to check, and it tells you who from that list does not currently have Wikipedia pages about them.
For instance, I gave it the ~2100 names from the table of contents from the Oxford Dictionary of African Biography (edited by Emmanuel K. Akyeampong and Henry Louis Gates), and asked about English Wikipedia. The list of people who (I think) do not have enwiki articles about them has 948 names. That means we do cover about half those Africans already, e.g., Nadine Gordimer. (This is an approximation, because I know some names need more finagling; for instance, currently the script messes up Barack Obama Sr.'s name so it wrongly thinks he doesn't have an enwiki page about him.)
I wrote this for Keilana (yay) as a tool to help fight systemic bias on Wikimedia projects. I hope other people find it useful. I've just added some code so that it prints out the percentage of missing people when it's done running, so you have a better measure of (for instance) French Wikipedia's coverage of important Senegalese leaders. I met Keilana in Berlin this past weekend at the Wikimedia Diversity Conference, and got to show her the power of APIs.
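Here is a minimal sketch of the idea, not the script itself. It assumes Python 3 and the MediaWiki API's `action=query` endpoint, which flags a nonexistent title with a `missing` key. The names `fetch_pages`, `find_missing`, and `percent_missing`, and the User-Agent string, are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def fetch_pages(titles, lang):
    """Ask one language Wikipedia about a batch of titles (needs network access)."""
    params = urlencode({
        "action": "query",
        "titles": "|".join(titles),
        "format": "json",
    })
    url = f"https://{lang}.wikipedia.org/w/api.php?{params}"
    # Hypothetical User-Agent; pick one that identifies your own tool.
    request = Request(url, headers={"User-Agent": "missing-from-wikipedia-sketch/0.1"})
    with urlopen(request) as response:
        return json.load(response)["query"]["pages"]


def find_missing(names, lang="en", fetch=fetch_pages, batch_size=50):
    """Return the names that have no page on the chosen Wikipedia."""
    missing = []
    for start in range(0, len(names), batch_size):
        batch = names[start:start + batch_size]
        pages = fetch(batch, lang)
        # Pages that do not exist carry a "missing" key in the response.
        missing.extend(p["title"] for p in pages.values() if "missing" in p)
    return missing


def percent_missing(names, missing):
    """Percentage of the input list that has no page."""
    return 100.0 * len(missing) / len(names) if names else 0.0
```

Passing the fetcher in as an argument means the counting logic can be tried against canned data without touching the network. A naive lookup like this has the same weakness mentioned above: names that need finagling (suffixes like "Sr.", alternate spellings) can be reported as missing when they are not.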
When I came to Hacker School, I had a general goal: "When I see a problem that could be solved by writing some Python and reading from/writing to an existing API, I want to recognize that and be able to solve the problem that way." Now I'm a little over halfway through and I have done it!
The code's GPL'd. Enjoy.
# (5) 06 Nov 2013, 08:06AM: Top, Iterators and Generators, and Git, Emacs, and REPL Tips:
Dumping into a post some things I've learned recently, trying to disregard the potential "you didn't know that already?!?!" surprise, feigned or genuine, that people might impose on me.*
* The magic of Hacker School: no one at Hacker School will do that. Nor well-actually me about this post! Random internet commenters might, and I may delete them.
- How did I never use top before? Magic! "Why in the world is my fan so loud? [run top] Epiphany, I closed that tab minutes ago, why are you still going like gangbusters? Fine, I'll quit and restart you."
- Lots of data types in Python are iterables. Like, say, lists, or strings. If you call the built-in iter function with an object of that type as the argument, you get an iterator -- if you want to do stuff with that, then you give it a name. An iterator (holy crap) is like a function that holds onto state, so that it remembers what its state was the last time you accessed it! The point of an iterator is to traverse the iterable from beginning to end, yielding one value each time it's called with .next() or similar, then saying all done with a StopIteration error. Like this:
>>> a = [3,6,9]
>>> a
[3, 6, 9]
>>> iter(a)
<listiterator object at 0x7f5c8c8da490>
>>> s = iter(a)
>>> s.next(), s.next(), s.next()
(3, 6, 9)
>>> s.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> r = "captain"
>>> w = iter(r)
>>> w.next()
'c'
- If the body of a Python function includes the verb yield (instead of return), then you've just made a generator. A generator creates an iterator that performs your whims! Again, you don't just call it directly; you assign a variable to a run of the generator function, with the same syntax as you'd use if you wanted to make an instance of a class, and then you have a generator object, which is an iterator that you treat as you would another iterator. Lemme show you:
>>> def foo():
...     yield "first"
...     yield "second"
...     yield "last"
...
>>> b = foo()
>>> b.next(), b.next(), b.next()
('first', 'second', 'last')
>>> b.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
I read (bits of) so very very many pages about this, and fellow students tried to help me get this (thank you, Joe and Gideon), and then yesterday I paired with Jessica McKellar and she sealed the deal and I think I get it now! As she explained, you might want to use generators if you want to get an infinite series of values (e.g., all the even numbers up to infinity). Or if you're crunching numbers and it takes hella resources to do this particular part of the crunching on THE WHOLE DATASET ALL AT ONCE, with generators you can just crunch one input at a time and yield it up, then move to the next input in the sequence seamlessly when needed. You can speed up bottlenecks in your assembly line by doing particular computations in a just-in-time way.
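Both of those use cases can be sketched in a few lines. This assumes Python 3, where the built-in next() replaces the .next() method; `even_numbers`, `expensive_step`, and `crunch` are invented names, and the "expensive" step is just squaring as a stand-in.

```python
from itertools import islice


def even_numbers():
    """Yield 0, 2, 4, ... forever; each value is computed only when asked for."""
    n = 0
    while True:
        yield n
        n += 2


def expensive_step(record):
    """Stand-in for a costly computation on a single input."""
    return record * record


def crunch(records):
    """Process one record at a time instead of the whole dataset at once."""
    for record in records:
        yield expensive_step(record)


# Take five values from the infinite series, then stop asking.
first_five = list(islice(even_numbers(), 5))

# Chaining generators builds the assembly line; nothing runs until next().
results = crunch(even_numbers())
first_three_squares = [next(results) for _ in range(3)]
```

Calling `list(even_numbers())` directly would never finish, which is why the sketch uses `islice` to cap how many values get pulled.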
- Meta-g g in Emacs takes me to a specific line number in the file. Putting (setq column-number-mode t) in ~/.emacs.d/init.el ensures that the statusbar at the bottom of the editor displays column number along with line number. These tips together make it much easier for me to seek out whatever discrepancies git or pep8 have brought to my attention.
- The xmlrpclib module makes it pretty easy to access XML-RPC web APIs, e.g. the Trac API as accessible on the Django project's site. However! The IPython and bpython REPLs may attempt to nicely autocomplete not-really-discoverable method names ... across the network ... and choke. And maybe crash. So if you want to play with it, just use the regular Python REPL. (But for everything else, oh wow, the bpython REPL is pretty snazzy.)
- git grep is great! It's automatically recursive, and only searches "the tracked files in the work tree, blobs registered in the index file, or blobs in given tree objects" (quoting from the man page). Just like with grep, if you use -n, then with every matching line you also get the line number. Or set lineNumber = True in the [grep] section of ~/.gitconfig to always have that on. If you miss colored output, use the --color=always option, or (as I just discovered) you should check out git configuration options, e.g. color.ui=true, to make LOTS OF OUTPUT colored and useful!
# (4) 04 Nov 2013, 09:30AM: Comprehensions:
I spent a bunch of September in San Francisco, trying to tie up loose ends at work so I could go on my sabbatical with a free heart. My notebook says things like:
"30 is a large #" -- why? context
explain briefly when to use test 2 vs beta cluster
Say there will be 4 types of failures, then give numbers as you go
While there, I finally went shopping with Val and bought some new sneakers, so I could throw away my ratty old sneakers. I'd bought them in a fit of exercise-related optimism about seven years prior. I find it easier to buy clothes and shoes in other cities. I'm already off-kilter, disequilibrated, so why not add one more change, get one more bit of anxiety over with?
And during that trip, I went one step further: I went to a salon and got my hair dyed blue, like I'd wanted to for years. The dark blue only looks obvious in bright light, so people at work did double-takes, checking that their eyes' photoreceptors hadn't fritzed out. I'd never done anything that chemical to my hair before. I hadn't wanted to sadden my mom.
I got to Hacker School on September 30th and found out I was one of two women with blue hair. (We discovered quickly that we have a few mutual friends.)
The weather got cooler and cooler as we eased into our term and found our rhythms. The library got more books as people donated or lent them to the school; now there are huge gaps on the shelves as the books migrate to work tables. The kitchen has accumulated several different coffee-making gadgets, about ten containers of communal tea, and a steadily increasing stack of leftover paper napkins from takeout lunches. Most people sit in the same place every day now, as far as I can tell. Some prefer the beanbags, some the conference room with plenty of sunlight, some the standing desks, some the ABSOLUTELY NO TALKING quiet room, some the rooms with whiteboards, some the shared tables. I try to move around a lot.
For the first few weeks of Hacker School, I consciously basked in the number, diversity, and quality of the women in my batch. As the folks who run HS recently blogged, 42% of our batch of 59 are women. I look around the room and our chat channels and I see people helping and being helped, within and across genders. After the first week, I still hadn't learned all the women's names! Now I'm nearly used to the gender balance, but those first few weeks disoriented me in a good way, to tell the truth, and visiting non-HS physical and online spaces disorients me back. From the HS blog post:
One of the many benefits of having a gender-balanced environment is that, at least within the confines of Hacker School, the pressure to represent or focus on "women in programming" largely fades away, and people are free to focus on programming rather than rehashing tired arguments.
Focus on becoming better programmers: our guiding star. We try to avoid distraction (one guy said his phone battery lasts longer these days). But I feel guilt for enjoying our oasis and concentrating on myself, when I have so many sisters outside, wishing and working for environments a tenth as nurturing as Hacker School is.
But I have to focus on my own transformation right now, letting this experience change me, so I can go carry that transformation elsewhere.
I take a walk most days. I'd never spent much time in the Soho/TriBeCa region before, and now I'm getting used to the tiny blocks and the tourists shopping for knockoffs on Canal. The other day I saw, in my meandering, a shop window advertising "Maps and Dictionaries," which amused me, because I've been improving my fluency in Python maps and dictionaries, and generally grokking things like data structures and lambdas and whatnot.
It's heady stuff.
Yes, I like grabbing data from APIs and munging it, and I chortle when I can make the command line do new tricks. But oh wow, functional programming and hash tables make me clutch my head and shout superlatives and profanities. I'm beginning to get how mild-mannered programmers can turn into complete zealots about things like functional programming and structured data. Oh, who am I kidding -- I already thought I understood how people could do that, just for something to believe in, but now I see how I could turn into one of those evangelists, if this were the only revelation I'd ever had or thought I'd have.
My notes from the past five weeks include far less "tell $person about $thing" than usual:
Went to Python "office hours," learned stuff re setuptools & pip & virtualenv, and started Flask tutorial - got to Hello World, then step 2. Emacs improvements....
Stopped when angry/tired, wrote down summary, got beer, got Joe, figured out was editing file that was not getting run (venv), started getting stuck in dependency hell (mysql?!) when checking whether problem was BZ-specific. Stopped for the day....
Some transformations make us over all at once, the same function applied uniformly to every element in a collection, from black hair to blue in an afternoon. Some happen to parts of us first, before other parts catch up, eventually consistent. I'd been programming for a long, long time before I called myself a programmer. I can't tell whether I feel arrived yet, whether I feel home. (We talk about progression in time as though it is progression in space, don't we? As though our lives are journeys, as though our schoolteachers are packing our saddlebags, as though a calendar is a map of time.)
Last week, Leonard and Beth made brownies with marshmallows and M&Ms. I taught a few peers at Hacker School to play Once Upon A Time. Leonard and I watched "Wives", a feminist Norwegian seventies film. I learned lots of little things about zip, map, filter, reduce, databases, packaging, bpython, bash. I dressed up as "Futuristic Businesswoman Sumana" for Hallowe'en, in my green business suit that looks vaguely Vulcan (lapels are illogical). I got to question 11 in Python Challenge. I'm in the middle of reading about eight books. The dead leaves started piling up on the sidewalk, fun to crunch through, and the autumn rain started, although Saturday the sun stayed out. I walked to the theater and thought, it won't be this warm again for five months.
Every few days I remember that Aaron is still dead. And I think I dreamt about my dad a few times in October; in one dream I got confused, thinking, "wait, I thought he died already, how could he be dying again?" but that's something you don't say to the rest of your family, or at least something I don't say. I think I've gotten to the long prairie of life where I'll be going to more funerals than weddings from here on out.
In September, in San Francisco, a colleague asked me: why all these changes all of a sudden? The sabbatical, the hair, the shoes? And I asked whether she remembered Aaron Swartz. She hadn't known him, but she remembered the public mourning of his death. I told her what he'd said, the revolution will be A/B tested, and explained what he'd meant. We activists have a responsibility to use our energy well. I, in particular, believe I need to become a better software engineer so I can be a better social engineer. So, I told her, I drew two relevant lessons from Aaron's death:
- Life is short, so be a better activist.
- Life is short, so do small harmless things that make you happy.
Today I'll put on those new shoes and go to Hacker School, and drink tea, and learn from women and men some new thing that makes me swear aloud, that will help me fight. Everything that lives changes; the only way to stop changing is to die. If I find myself afraid of growing, I'll remember all the forces that don't want me to learn. Death being only one of them.
# 01 Nov 2013, 11:06AM: PEP 8 Compliance:
It's easier to read and contribute to code when it's stylistically consistent. This is a reason why we have PEP 8, the style guide for Python code. It says things like:
Avoid extraneous whitespace in the following situations:...
Immediately before the open parenthesis that starts the argument list of a function call:
No: spam (1)
Certainly most of the Python code I run across follows that convention. So I got confused when I read Bicho code that sometimes had extraneous whitespace between function name and arguments. Sometimes it did and sometimes it didn't. From a note by one of the maintainers I inferred that Bicho's developers want code to comply with PEP 8.
So I decided to look for those discrepancies, so I could fix them. You can use the pep8 module to find instances of PEP 8 noncompliance, and you can give it arguments to narrow down to just one issue. A command like
$ python pep8.py --select=E211 Bicho/
gave me the list of lines with extraneous whitespace before the open parenthesis. (I've edited out some path-related cruft.) I thought I'd write a regex to fix those lines, but Julia and Leah kindly talked me into seeking out a pre-existing tool first, and I found Pep8ify.
$ pep8ify -f whitespace_before_parameters Bicho/
gave me the proposed fixes as readymade diffs. To make Pep8ify do those fixes:
$ pep8ify -f whitespace_before_parameters Bicho/ -w Bicho/
So now I've filed an issue with a pull request. (I also used Pep8ify to clean up some whitespace inconsistencies around operators like "+" and "=" while I was at it.)
Thanks to Szymon Guz's blog post for pointing me in the right directions.
# (1) 28 Oct 2013, 11:59PM: On Ability:
Someone discovered "that the addition of 'Harry' to almost any Plato quote makes it seem legitimately like a nugget of wisdom out of the mouth of Albus Dumbledore." This reminded me to look up my favorite Dumbledore quote:
It is our choices, Harry, that show what we truly are, far more than our abilities.
I am trying to remember that, because every day I go to Hacker School and sit next to people with lots more programming skill than me, and sometimes I find that discouraging. Or I realize how badly I want to impress people, to feel admired and respected, and how that sometimes gets in the way of growing and achieving actually admirable, respect-worthy things. I need to remember to disregard that kind of anxiety fungus emotion. Thomas Beagle said in some related comments:
to be a good geek you [have] to have both humility and arrogance in equal measures. The humility was so you'd admit you didn't know something and get help/read the docs/etc., the arrogance was the bit that said "I don't know that now... but I can and I will soon."
I think that, like a lot of people, I conflate skill and confidence, and I need to disassemble a construct I didn't even realize I had in my mental infrastructure. How slippery, that the confidence I need to develop is the confidence to express uncertainty, to say "I don't understand" as many times as it takes. Our Hacker School facilitators guide us to try projects that intimidate and scare us. Truly being vulnerable to my own ignorance is on that list. I wish I knew how to credibly and persistently promise myself that the rewards from being open to change are greater than the return on inertia.
# (2) 26 Oct 2013, 05:07PM: Some Artifacts:
At Hacker School, I'm working on little projects to teach myself various things. I am following my own advice by embracing silliness. A few things I have made, all of which now have code up on GitHub:
An Obama speech generator. I wrote this command-line speech generator just after Barack Obama gave a televised speech partially about trouble with HealthCare.gov. I thought, "what if Barack Obama gave LOTS of speeches about tech?"
Also, I wanted to try out test-driven development, so speech-tests.py has the tests I wrote (using Python's unittest module) before or as I wrote functionality.
So, here you go. Run speech.py at a command line and type in three tech buzzwords when asked. (Alphabetical characters and spaces work, but no other punctuation -- "the cloud" and "NoSQL" are fine, but "object-oriented" won't work.) You'll then get a short speech incorporating at least one of the buzzwords you've provided. This gets Leonard to laugh a lot at lines like "Thank you, and may God bless Agile."
"Personality Rights", a super-short and moody game. I made it in Ren'Py, a platform for making "visual novel"-type games, which meant I fiddled with a sort of domain-specific subset of Python to specify plot, characters, styling, music, and graphics. (Graphics include the turtle image at right.) I wrote it in three hours because that's the time limit for the Ectocomp game competition.
A scifi novel title generator (previously).
Improvements to Bicho, although I backed off my giant "refactor and add tests to everything" plan in favor of learning to write tests first.
(I wish I were using Gitorious instead of the closed-source GitHub, but Gitorious wouldn't let me log in, even after password reset. Bleah!)
What have I learned? I only have eight weeks left. I'm better than I was at Python, emacs, git, and bash. I have gotten to question ten on Python Challenge, and finished all of CodingBat Python.
Sometimes I have a reaction I don't like. When I learn some new amazing thing you can do with Python, I get angry that I didn't know it already. It's a very fixed-model way to react, not a growth-model way, and I think I have to fix more things in me before that stops happening.
# (1) 26 Oct 2013, 12:42PM: Hacking Music:
I spent a day at Hacker School without a working pair of headphones recently. So I spent about three hours pairing, which I loved and which I'm going to repeat every day that I can. But I also booked a room to play music in and invited fellow participants to come in and code while listening to music together for an hour. I was just going to play some video game soundtracks off my Nokia N9, but Andrew showed up with his massively multiplayer music player and the speakers we use for parties. I know when I'm outclassed.
When I ask my colleagues what they like to hear while coding, we mostly agree: "nothing with words I can understand." So, here, monolingual English speakers have a big advantage! Some albums I like:
- Greatest Marches (various composers and performers). The original techno music!
- Beirut, The Gulag Orkestar
- Eric Skiff, Resistor Anthems
- Wendy Carlos, Tron soundtrack (original Tron)
- Daft Punk, Tron:Legacy soundtrack (like everyone else, I also enjoy the "Reconfigured" remix album)
- El Ultimo Skalon, Ciudadano del mundo
- Grand Valley State University New Music Ensemble, In C Remixed
And then there are the songs I listen to on the subway when I am discouraged from bughunting and troubleshooting. These are my "Perseverance!" songs.
- "Tubthumping" (They Might Be Giants version)
- "Give Paris One More Chance" by Jonathan Richman
- "Be Born" by Tally Hall
- "The Sadder but Wiser Girl" from The Music Man soundtrack
- "This Too Shall Pass" by OK Go
- "Smells Like Teen Spirit" by Nirvana
- "Get Around" by Leonard Richardson
- "Better Times are Coming" by Kate and Anna McGarrigle, from "Songs of the Civil War"
- "You've Got To Do It" (lyrics and tune by Mr. Fred Rogers, interpretation by Holly Yarbrough in Mr. Rogers Swings!). Some lyrics I have listened to over and over:
If you want to ride a bicycle and ride it straight and tall
You can't just sit and look at it, 'cause it won't move at all
It's you who has to try it
And it's you who has to fall
If you want to ride a bicycle and ride it straight and tall
# 22 Oct 2013, 08:25AM: OMG, or, Biting Off More Than I Can Chew:
So, there are random non-programming reasons why I didn't feel like I made much progress yesterday -- I tried working in a beanbag chair (no good for Sumana), I drank coffee too early in the day (caffeine crash in the afternoon) and I listened to the wrong music (Guster and Neutral Milk Hotel? might as well be The Mountain Goats for all the good that does me) -- but here are the big ones.
- When I got frustrated, I didn't get help as much and as often as I should have. This is DESPITE the fact that I DID get help from Zach, Allison, Stew, and, crucially, Mary. I had short conversations with the first three, and then Mary paired with me for 45 minutes and helped me understand some things about my goal of refactoring a big codebase. Such as "it is a big job" and "you should have a guiding thing you're trying to do as you go, like adding a new feature, or adding tests". Also she helped me see how to write a test with little mocks so that I'm not doing the foolish thing of testing "does the exact same technique (as the function I'm testing) work on the exact same input (as the function I'm testing is using) and get me the same output (ibid.)?"
- When I was trying to play with Beautiful Soup for the first time, I used a test page different from the test page in the documentation/tutorial. Then, when I ran into seemingly inexplicable errors, I didn't think to look at the differences between my input data and the input data in the documentation examples. I felt especially helpless because my spouse wrote Beautiful Soup and thus it seemed like I ought to wait till I got home and ask him for help. But that broke my momentum; I should have swallowed my embarrassment and pride, and asked someone at HS for help.
- To feel less like a broken and incompetent programmer, I did some simple Codecademy exercises, but the interface feels slow and -- since I already know the concepts in the exercises I was doing -- I knew I wasn't really learning anything. I should have skipped them, or just skipped to something hard.
- I paired with Julia to understand some gzip stuff to hopefully help her debug a problem, which means I learned some things about Huffman coding and the mind-breaking way that LZ77 encoding and decoding works. But it took what felt like a super long time because I never learned some basic CS things and I am not facile with binary arithmetic, and I felt like a drag, and felt blergh. Maybe I was just in a down mood; maybe I should have just bid Julia goodbye and gone for a walk or something. (Sorry for my raincloud, Julia!)
- We heard a talk (in our Monday night talk series) that somewhat went over my head, and which, as I realized about 25 minutes through, was simply going to be hard for someone with my learning style. I am an active enough learner that hearing about concepts like event loops and threads (and the problems concomitant with those approaches to concurrency) isn't enough; I need to play with them and experience them. I am a visual enough learner that, if I haven't tried writing concurrent programs before, I need diagrams or animations or similar visual elements to help me understand what works and doesn't work, rather than just sentences spoken orally. I am a sequential enough learner that if I don't get the first concepts in a presentation, it's going to be hard for me to grok any of the middle. And I am a sensing enough learner that I really need to understand the examples, and I had a hard time reading the syntax of an unfamiliar language (OCaml) to get at what was happening in the examples. (This is interesting data because I thought I was way more on the intuitive side of the sensing-intuitive spectrum, and the verbal side of the visual-verbal spectrum, and the reflective side of the active-reflective spectrum. Failures show us nuance!) So, for those first 25 minutes, I mostly felt unintelligent, and tried to follow, and felt my morale sag. After I realized "oh, this is almost exactly the opposite of my learning style", I felt less bad. So in the future I shall try to have that realization earlier. (Monday night talks are lids-down, so it's bad manners to try to understand the speaker by writing code. Even though I consider myself a crap artist, maybe I should try getting out pen and notebook to draw my own diagrams in cases like this.)
- After I came home, Leonard worked with me on Beautiful Soup play and helped me understand another perspective: writing tests is writing code, and writing tests is hard -- in fact, generally harder than writing the code that the tests test. You have to think on another plane. Oh. I've been blithely walking into intensely difficult non-solved-problem areas of software engineering, viz., refactoring and testing, with a codebase entirely new to me, as a not-very-experienced programmer. No wonder I've been running into difficulty.
Today is a new day! As soon as I hit Publish I will pair with Leonard a little. Yay morning freshness.
# 18 Oct 2013, 11:17PM: A Mediocre Day But A Good Week:
Mel Chua visited Hacker School last week and especially entreated us to blog about bad days, days we felt demoralized or unproductive. It helps her with her research. Well, Mel, here you go.
It's a Friday, and we get Fridays "off", that is, we don't have daily checkins and the facilitators don't have bookable office hours and some of my colleagues are gone and it feels muted and off-kilter. On September 27th I'd thought that I'd be working four days a week, but I look at colleagues who come in every single day, including weekends, and feel FOMO. Fear Of Missing Out. So I've started coming in for half a day on Fridays, but it rarely feels as good and productive. I think by the end of Thursday I could really use a break to recharge. So that's one thing. I think from now on, if I come in on Fridays, it should be to accomplish a very specific task, and I should leave when that task is done or when it becomes clear that my energy or cognition is flagging.
Last week some of us joked around that we should do the opposite of Casual Friday: Fancy Friday we dubbed it. So today I came in wearing a dress, a bit low-cut at that. But I didn't see anyone else in suits or gowns or similar, so I didn't feel as comfortable in what I was wearing. For next week, if I come in on Friday, I may try a pantsuit.
I arrived around lunch time and I brought my lunch, which is good because it makes me feel good to be frugal. But I read a depressing and not particularly edifying message board as I ate, and I should probably save that kind of thing for the weekends or my couch at home.
I started off my work with the vague goal of "learn about unit testing" and it took me a lot more reading, sighing, and moping time than it should have for me to ask for help (thank you, Ryan, for our impromptu chat in the kitchen that led me to understand when to use assertions and when to go for mocks, and thank you to HSers who chatted with me about Mock) and to reduce my aims to something more manageable. Next time: follow my own advice, and ask for help after fifteen minutes of feeling stuck. Also, "learn about x" is an okay way to start surveying the problem space and the solution space, but "try a single implementation/example of x in a toy app" is a much better goal for an afternoon.
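For a sense of how assertions and mocks divide the work, here is a toy sketch in Python 3 using the standard library's unittest.mock. Everything in it is hypothetical: `count_open_bugs` and the `fetch_bugs` method are invented for illustration and are not part of Bicho or any real tracker API.

```python
from unittest import mock


def count_open_bugs(tracker):
    """Return how many bugs the tracker reports as open."""
    return sum(1 for bug in tracker.fetch_bugs() if bug["status"] == "open")


def test_count_open_bugs():
    # The mock stands in for a real bug tracker, so no network is needed.
    tracker = mock.Mock()
    tracker.fetch_bugs.return_value = [
        {"id": 1, "status": "open"},
        {"id": 2, "status": "closed"},
        {"id": 3, "status": "open"},
    ]
    # A plain assertion compares the result to an answer worked out by hand.
    assert count_open_bugs(tracker) == 2
    # The mock also records how the collaborator was used.
    tracker.fetch_bugs.assert_called_once_with()
```

The expected value is counted by hand from the canned data, so the test does not re-run the same logic it is supposed to be checking.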
I drank coffee when I should have had water. I ate licorice when I should have snacked on edamame. So I got jittery and sugar-crashy instead of focused.
I chatted with someone, and I was judgy or negative when I could have been more thoughtful and constructive.
I helped people with git problems and questions, which I'm glad about, but I missed the opportunity to ask them about their learning styles first and organize my thoughts a little accordingly. It felt haphazard.
It took me way too long to start listening to energetic productivity-provoking music on my music player; maybe I'll just set a reminder to make that happen around 11am every weekday.
And there's random other stuff on my mind, e.g., having to grab my old mail off the OCF's servers by Sunday.
So, Mel, overall, today I started off in a low-energy, non-driven mood, and I didn't take the kinds of steps I know I oughta take in order to fix it. But nearly every time I spoke with someone, it shook me out of my rut, and helped me gather the activation energy to do The Right Next Thing. So it could be that Fridays I should just try to arrive in the morning and set up a pact with a few colleagues to do a check-in conversation every 30 minutes. (It's easier to set up that sort of thing upon shared arrival in the morning.)
Hope that helps you. It helped me.
What did I accomplish today? I implemented a few docstrings and started learning how to use __repr__. I showed some people how to work with branches and multiple remotes in git, and how to fork and make pull requests on GitHub. I reported a few bugs in one product and made a pull request for another. I used "git cherry-pick" for the first time, with Alan's guidance. I wrote most of a test that uses "assert" to check that there's a path from Independence to Portland given the links between cities in my game. I got emacs to give me two side-by-side buffers. 3.7 hours tracked in Project Hamster, 2.9 of learning and .8 of teaching, plus a few more of faffing about on the net or in conversation. But I surpassed my 20-hour learning goal for the week -- I'm around 25 -- so this week overall I've done well.
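A test along those lines might look like the sketch below (Python 3). The `LINKS` map and the `has_path` helper are assumptions made up for illustration; the real game's data and code may look quite different.

```python
from collections import deque

# Hypothetical one-way links between stops; the real game's map may differ.
LINKS = {
    "Independence": ["Fort Kearney"],
    "Fort Kearney": ["Fort Laramie"],
    "Fort Laramie": ["Fort Hall"],
    "Fort Hall": ["Portland"],
    "Portland": [],
}


def has_path(links, start, goal):
    """Breadth-first search: True if goal can be reached from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        city = queue.popleft()
        if city == goal:
            return True
        for neighbor in links.get(city, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


def test_trail_is_connected():
    assert has_path(LINKS, "Independence", "Portland")
    # The links are one-way, so there should be no route back east.
    assert not has_path(LINKS, "Portland", "Independence")
```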
In retrospect, today felt suboptimal in contrast to a usual Hacker School day. Today I plateaued. I think after some rest this weekend I'll plunge in fresh on Monday with clear goals and better discipline.
# 16 Oct 2013, 11:19PM: Idiosyncratic Troubleshooting Tips:
Yesterday I tried to diagnose and fix a bug in an open source project. I got discouraged because of a few factors, so I'm noting down a few things I ran into, for future Sumana and other similar folks.
Thanks to Joe, Fei, Rupa, Allison, Travis, Moshe, Kat, Julia, and the Bicho developers for their help the past few days!
- Are you editing the right file? If you're in a virtual environment, make extra special sure that the file you're trying to tweak is the same one listed in the traceback.
- Special characters? For instance, bash might get all weird on you if there's an ampersand (&) in an argument you're passing via the command line.
- pdb, assertions, and IPython. I'm working in Python, and I've started to learn to use "python -i", the Python debugger, "assert" tests, and the IPython toolkit. IPython especially is cool because the visual presentation of the stack trace is easier to follow.
- The database setup toolchain is blergh but worth it. If a project needs a MySQL database set up, then fine. There's a little bit of dependency hell but it's not intractable, especially if you have someone nearby who's done it before. What all did I have to do? I can't retrace the order, but from looking at my .bash_history, Synaptic history, and dpkg.log:
- apt-get install python-dev
- apt-get install python-mysqldb
- apt-get install mysql-server
- apt-get install libmysqlclient-dev
- apt-get install mysql-common
- pip install MySQL-python
- pip install python-MySQLdb (I think?)
And then, you know, you have to do the initial privilege-setting and connection-making, and probably create a database, blah blah blah. But! Eventually it works and you can alter and create and drop things like it's going out of style. (Which it probably is, memory bank fashion going the way it is.) And it does eventually work, and the stack of dependencies doesn't REALLY take up loads of disk space the way it feels like it will.
- Branch! Not quite as relevant, but: just get into the habit of proper git hygiene when working on improving a shared codebase, e.g., switch to a new branch for a new logical set of changes. It makes merge requests/pull requests so much more frictionless. And then "git checkout -" makes it super easy to switch between the branch you're on and the last branch you were working on.
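The first two tips above can be checked from Python itself. This is a minimal sketch, not part of the original debugging session: `loaded_from` and `safe_command` are hypothetical helper names, and `mytool` with its `--url` argument is a made-up command used only for illustration. The first helper reports which file Python really imports for a module, so you can compare it with the path in the traceback; the second quotes arguments so an ampersand is not interpreted by bash.

```python
import importlib
import shlex


def loaded_from(module_name):
    """Return the path of the file Python actually imports for module_name."""
    module = importlib.import_module(module_name)
    # Built-in modules have no __file__, so fall back to None.
    return getattr(module, "__file__", None)


def safe_command(program, *args):
    """Build a command line with every part quoted for a POSIX shell."""
    return " ".join([shlex.quote(program)] + [shlex.quote(a) for a in args])


if __name__ == "__main__":
    # Compare this path with the one listed in the traceback.
    print(loaded_from("json"))
    # "mytool" is a hypothetical command; the ampersand ends up inside quotes.
    print(safe_command("mytool", "--url", "https://example.org/bugs?a=1&b=2"))
```

Inside a virtual environment, the path printed for a third-party module should point into that environment's site-packages directory; if it points somewhere else, you are probably editing the wrong copy.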
# 09 Oct 2013, 10:43PM: Programming Jokes:
The Hacker School application form asks you to provide some code you've written so the faculty can look at it. I wrote a game: "Where on the Oregon Trail is Carmen Sandiego?" It is a joke of a game and a platform for further jokes. During my first week at Hacker School, I improved my programming skills by improving it. For instance, now multiple villains might have stolen that wagon tongue, including Waldo.
Kat Walsh encouraged me to actually implement the joke I made in August. So I am now working on a toy web app (using Flask) to grab physics article titles from English Wikipedia (via the MediaWiki API, via Pywikibot) and perform Queneau assembly on them to make plausible scifi novel titles, and then display those strings on a web page. So far, fun titles have included:
- Optical Reluctance
- Hazard Steel
- Electrodynamic Hackerman
- Joule 1584
- Choke River
- Nernst Hopping
- Source Cloaking
- Joule Summation
- Waveguide Bearing
- Capacitance Torus
- Ionic Agent
- Tunnel Curve

They make people laugh. With software, I can scale my comedy! I can make more people laugh at more things. I think we could get more people programming if we showed comedians that you can pull better pranks if you can code.
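A minimal sketch of the Queneau assembly step, under stated assumptions: it is not the app's actual code, it skips Flask, Pywikibot, and the MediaWiki API entirely, and the function name `queneau_titles` and the `SAMPLE_TITLES` list are hypothetical stand-ins. The idea shown is that each new title takes its opening word from the start of one source title and its closing word from the end of another.

```python
import random


def queneau_titles(titles, count, seed=None):
    """Make new titles: an opening word from one title, a closing word from another."""
    rng = random.Random(seed)
    # Only titles with at least two words have a distinct start and end.
    split = [t.split() for t in titles if len(t.split()) >= 2]
    firsts = [words[0] for words in split]
    lasts = [words[-1] for words in split]
    return [f"{rng.choice(firsts)} {rng.choice(lasts)}" for _ in range(count)]


# Hypothetical sample input; the real app would pull titles from English Wikipedia.
SAMPLE_TITLES = [
    "Optical waveguide",
    "Magnetic reluctance",
    "Joule heating",
    "Tunnel diode",
]

if __name__ == "__main__":
    for title in queneau_titles(SAMPLE_TITLES, count=5, seed=2013):
        print(title)
```

Keeping first words and last words in separate pools is what makes the results sound like plausible titles: every output still starts the way some real title starts and ends the way some real title ends.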
# 06 Oct 2013, 12:00PM: What They Don't Know:
Or: you are an expert if you can save people time.
Late in 2011, I found out that one of my colleagues, a whip-smart and infinitely organized administrator, wanted to know more about how the engineering side of Wikimedia works. So I started teaching her. Every month, we talked for about an hour. She asked me about some activity from the monthly report and I explained what we're doing and why, often using analogies. She loved it and felt far more connected to what her other colleagues were doing.
She's not at Wikimedia anymore, so I have tried doing it as a Wikimania presentation and continuing the tradition with other WMFers who were interested. So far I've done a lot of one-off "What the fudge does Wikimedia engineering do" sessions for incoming folks, mostly non-engineers coming into the Foundation's other departments.
Two lessons from that experience:
- Sure, continuing mentorship relationships are awesome. But don't discount the value of a few limited teaching sessions.
- I have about three approaches to teaching this stuff: Historical (What has happened since we started in 2001?), Experiential (What happens under the hood when you go to en.wikipedia.org in your browser, and who's in charge of what parts?), and Organizational (Who are the eight directorates in WMF engineering, and who are other important Wikimedia tech institutions, and who does what?). I want to get better at the historical mode, which means learning what happened in what order between 2001 and 2011; right now I do the org-chart mode quite well, and the experiential mode well except for talking about load-balancing and caching.
I wish I'd kept good notes of all the questions people have asked during these sessions. Some of them:
- What is a parser?
- What is LAMP?
- What is MySQL? What is a database?
- What is Apache?
- What does "open source" really mean?
- How can it be that so many talented programmers are only in their twenties?
- What is the role of the Engineering Community Team?
- What do the people in the MediaWiki core team do?
- What is Subversion, what is Git, and why did we switch?
- Do all the Wikimedia sites run on MediaWiki?
- How can we do what we do with so little staff?
- What's with this Lua thing?
- Why has it taken so long to write the Visual Editor? (This question led me to sketch out a blog post we published.)
- What is a "virtualized hosted development environment" (Labs)?
- Why did we have to switch to IPv6 and why was that hard?
- What is an API?
- What is HipHop and why would we use it?
- Why did we work on a specialized Wiki Loves Monuments app?
- What are the Universal Language Selector and Milkshake?
- What is Swift?
- What is the difference between the E2 (Editor Engagement) and E3 (Editor Engagement Experiments) teams, and what do they do? (We partially fixed this by rearranging and renaming the teams to Core and Growth.)
- What is HTTPS? How does SSL work?
- What kind of security problems could a web-based application have? Do we have worse problems because we're open source?
- What does a product manager do?
- Why don't we provide automatic translation from and to different language Wikipedias?
- What is Wikidata?
- Is Wikipedia Zero just for Wikipedia, or also the sibling sites?
- How do we consult with hundreds of different wiki communities when building and rolling out our software, especially when we don't speak their language?
I have just started at Hacker School, a place designed to help everyone learn. That means making people feel comfortable with saying "I don't know". I've benefited countless times from this, because if no one's going to belittle me for not knowing something, I feel safer asking and learning. I didn't realize how much I would also get to teach! When everyone feels safe saying "What does that mean?" then I get to help more people learn more things. I've explained, among other things:
- what Markdown is, and why you would use it
- what screen-scraping is, and why APIs would be better
- how I use git
- what unit tests are, and why you would add automated testing to your project
- dozens of opportunities to reuse, integrate with, or improve Wikimedia data and software
- a bunch of Unix command-line tips, such as control-R for interactive search of bash history
- the sordid history of ReiserFS
- why Nvidia drivers are the classic example of "proprietary stuff that's not in the Linux kernel but that you might want to use so some distributions carry it"

It's super amazing when you teach someone a skill or a perspective that changes them. I feel so lucky that I am an expert, i.e., someone who can save other people time. It is a form of hospitality.
# 28 Aug 2013, 01:05PM: Accepted To Hacker School:
If you've read my past posts on Geek Feminism, you've seen me thinking about how I learn. I have worked 15 years in the software industry as a tech writer/salesperson/tester/manager. And I have about 25 years of occasional BASIC/Scheme/SQL/Visual Basic/Python/CSS/bash under my belt. I've enjoyed dabbling; I love solving problems with code and the "it works!" feeling of making something that does my bidding. And, thanks especially to AdaCamp, the Boston Python Workshop, and related communities, I've now learned that I learn best when I'm around other curious, passionate, and respectful people whom I can teach and learn from and brag to, and in a physical space we dedicate to that activity for big stretches of time. Since then I haven't had the time to focus on improving those skills.
That sounds exactly like Hacker School. So I applied for the autumn batch, and I've been accepted. I will therefore be taking an unpaid personal leave of absence from the Wikimedia Foundation via our sabbatical program. My last workday before my leave will be Friday, September 27. I plan to be on leave all of October, November, and December, returning to WMF in January. During my absence, Quim Gil will be the temporary head of the Engineering Community Team. I'll spend much of September turning over responsibilities to him. Over the next month I'll be saying no to a lot of requests so I can ensure I take care of all my commitments by September 27th, when I'll be turning off my wikimedia.org email.
When I'm in the zone, growing my programming skills, time is a blur, I feel powerful, and I am in awe of what we can make. And the more I think about doing Hacker School, having that feeling for weeks at a stretch, the more excited I get. So I'm thrilled that I can take three months off my job to come to Hacker School, so I can make tools to make my life easier, and so I can be a better community manager for MediaWiki (calling out easy bugs for newbies, running stats, packaging and customizing tools, etc.). I want to nurture the programmer side of myself, because programming is heady fun, and because the skillset will supercharge everything else I do. I'll be a more effective citizen, coach, and leader if I increase my fluency in code.
After all, it's going to take a lot of energy and innovation to improve the quality of open source software. We need open source software that ordinary people can use, with documentation in the languages users speak, and whose design addresses the needs of women and men worldwide. Whatever approach I take to that problem -- mentorship, platform-building, recruiting specific demographics, media-making -- I anticipate wanting to hack a lot of dashboards, APIs, courseware, wiki templates, poorly formatted datasets, CRMs, and helpful little scripts along the way.
Thank you, WMF, for the sabbatical program, and thanks to my team (especially Engineering Community Team's Quim Gil, Andre Klapper, Guillaume Paumier, and my boss Rob Lanphier) for supporting me on this; I couldn't do this without you. And thanks to the women-in-open-source community, especially the Ada Initiative, for helping me gain the confidence to take this step. (The Ada Initiative's trying to finish its fundraiser, in case you can help.)
If there's anything else I can do to minimize inconvenience, please let me know. And wish me courage!
You can hire me through Changeset Consulting.
This work by Sumana Harihareswara is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Permissions beyond the scope of this license may be available by emailing the author at firstname.lastname@example.org.