After a bit of a hiatus, this newsletter is officially back. Thanks for sticking with it and expect one of these in your inbox every other Friday from here on out.
Open records laws are wonderful tools for journalists and other citizens to hold government agencies to account. They force the government to reveal uncomfortable truths about how it works (and often how it doesn’t) and help the public better understand the effects of politics and policy.
But as any data journalist knows, getting the information from an agency is not even half the battle. While you would think that agencies would turn over data in the actual data formats in which they exist, that is often not the case.
Anyone who has requested data from a government agency knows the dread of opening an email containing their data only to see the file extension PDF. I wish I could say that it was a rite of passage for first-time requesters, but I cannot.
A few years ago, The Washington Post endeavored to understand murder in major American cities and initially sent 50 records requests to police departments in 50 of the largest cities across the country. Many departments turned over Excel files, thankfully, but many more turned over PDF files of their spreadsheets. One even turned over a handwritten database of murder in their city. But that’s a story for a different time.
While I would rather not spend significant amounts of my time putting data back in its original format, that is unfortunately part of the job. But if I am going to have to do that, I might as well do it as efficiently as possible. There are many ways to get the job done but using the right tool is your best bet.
In this case, the best tool was Tabula. Most of the PDFs were already text-based and in spreadsheet format, so Tabula was able to extract much of the data without issue. For the tough stuff I had to get creative but generally speaking, Tabula was the best and fastest option.
For one city, we only got narratives of every murder. For that, I had to create my own tool for the job. Writing a script in Python, I extracted pieces of each narrative into fields. I had to clean up a few irregularities but overall it worked well. Sometimes you have to get creative and make your own tools.
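For the curious, here is a minimal sketch of what that kind of script can look like. It is not the script I used for the Post project: the sample narrative, the field names and the patterns in `FIELD_PATTERNS` are all made up for illustration, and real narratives are messier than this.

```python
import re

# Hypothetical patterns: one regular expression per field we want to pull out.
# Real narratives would need patterns tuned to how that department writes.
FIELD_PATTERNS = {
    "date": re.compile(r"\bOn (\d{1,2}/\d{1,2}/\d{4})"),
    "victim_age": re.compile(r"\bage (\d{1,3})\b"),
    "method": re.compile(r"\bwas (shot|stabbed|beaten|strangled)\b", re.IGNORECASE),
    "location": re.compile(r"\bin the (\d+ block of [^.,]+)"),
}


def parse_narrative(text):
    """Return a dict with one entry per field; None when a pattern finds nothing."""
    record = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        record[field] = match.group(1).strip() if match else None
    return record


# Invented example narrative, not taken from any real records response.
sample = (
    "On 3/14/2016, John Doe, age 34, was shot in the 1200 block of Main St. "
    "No arrest has been made."
)

if __name__ == "__main__":
    print(parse_narrative(sample))
```

Leaving a field as `None` when nothing matches makes the irregular narratives easy to find afterward, so they can be cleaned up by hand.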
Last year, I got a chance to build my first chair. Well, chairs. I took on an order for six dining chairs all at once to go around the table of a couple of friends. The chairs were to be built identically to one another, except for two chairs, one for each end of the table, built slightly larger than the rest.
Setting out, I faced a lot of challenges. First, finding wood that was wide enough for the seat was a challenge. With some wide wood, mortises and tenons, I was able to glue up seats that comfortably hold a lot of weight.
As I’ve written about previously in this newsletter, one major key to replicating the same thing over and over again is using a jig. A jig removes variation from item to item and allows for symmetrical work over many iterations. For the chairs, it meant creating tapers in 24 legs and curvature in 18 seat-back rails.
For the legs, each leg was tapered in on two sides, requiring 48 total cuts that needed to be the same. Luckily, I had previously bought a simple taper jig to go with my table saw and I was able to run all the legs through relatively quickly and with ease. There are many ways to get the job done but using the right tool is your best bet.
The part that really required thought was the rails. There are many ways to make rails, but it is hard to get them all the same, and creating a smooth curve is tricky. I chose to cut the curves with my bandsaw, but figuring out how to draw the shape proved difficult. Normally a drawing compass is a great tool for drawing curves, but not for something so thin and wide.
I finally figured it out when I was cleaning up the shop one day and came across a big piece of scrap wood from a round conference table I built earlier in the year. The curvature of the cut, based on the edge of a circle with a five-foot diameter, worked perfectly for tracing curves on each rail.
To cut the curves well, I put a fresh blade on the bandsaw, and everything cut like butter. I considered a few approaches, but ultimately this was the one that I felt would create the most consistent rails. Sometimes you have to get creative and make your own tools.
datawork
U.S. Marshals Act Like Local Police With More Violence and Less Accountability — “We found that at least 177 people were shot by a marshal, task force member or local cop helping in a marshals arrest; 124 people, mostly suspects and a handful of bystanders, died from their injuries. In addition, seven committed suicide after being shot. On average, from 2015 to late 2020, they shot 31 people a year, killing 22 of them. By comparison, Houston police reported shooting an average of 19 people a year, killing eight. Philadelphia officers shot an average of nine people a year, killing three. Both departments employ roughly 6,000 officers, about the same number who serve in the Marshals Service and on its task forces.” [The Marshall Project and The USA Today Network]
Despite pandemic, Portland public housing contractor keeps churning out evictions — “When it comes to evictions, Home Forward, the state’s largest provider of affordable housing, has earned a reputation among many local tenant attorneys and advocates as refreshingly tolerant with the vulnerable and low-income tenants it serves in the properties it manages directly. But court records reviewed by OPB show the property management company that Home Forward contracts with for much of the nonprofit’s housing stock — including Pearl Court — takes a different tack: since April, Income Property Management has filed to evict nearly three times as many households out of properties it oversees for Home Forward compared to Home Forward itself.” [Oregon Public Broadcasting]
Pervasive vaccine inequity for minorities persists all over NC, new records show — “Minority communities in all corners of North Carolina continue to be under-represented, in some cases at alarming rates, when it comes to getting coronavirus vaccines, an Observer analysis of new state data found. At issue is the rate at which people of different races and ethnicities are getting the COVID-19 vaccine compared with their overall population in a given county. The Observer’s analysis of state data released last Friday found significant racial and ethnic disparities in urban, suburban and rural counties, and from the coast to the mountains.” [The Charlotte Observer]
900,000 infected. More than 15,000 dead. How the coronavirus tore through D.C., Maryland and Virginia. — “By December, hundreds of thousands would be infected in the region, including a Virginia widow who placed her late husband’s ashes in a church portico on a recent sunny day; a D.C. nursing home aide who struggled to climb the steps of her apartment building that same morning; and a priest in the Allegheny Mountains who was tethered to an oxygen machine, grateful for friends who help with the laundry. The march of the virus through the District, Maryland and Virginia has exposed cruel disparities influencing who gets sick and who survives. But at each stage of the pandemic, the virus has also shown that anyone — from the poor to the powerful — can become a victim.” [The Washington Post]
woodwork
If you enjoyed the newsletter, please like, subscribe and share. This newsletter will always be free but if you’d like more exclusive content or to support the woodworking done by the databae woodshop, please consider contributing to my Patreon.