Changelog

While working on the code I’ve been keeping a list of changes over time, using the version-control tool Git. I thought I’d share that “changelog” here for posterity, as I won’t include it in the initial “release” of the software package for public consumption.

These entries may not be entirely clear, as they’re usually shorthand summaries of the work that’s just been done. Sometimes they represent just a line or two of code changes; sometimes they represent many new functions and features landing all at once. But here they are (in reverse chronological order as of today) in case they’re useful:

2022-04-09
  • Add but don’t use a welcome widget, need to work out width issues
  • News item trend chart
  • Latest news dashboard widget and starred feed activity count
  • Remove default account widget
  • Initial setup for dashboard stat widgets
  • Reorder sidebar for daily workflow
  • Enable dark mode and collapsible sidebar on desktop
  • Add Filament config to our package’s published configs for simplicity
  • Don’t show activity section in create feed form
  • Disable manual creation of news items
  • Specify feed check freq when importing, better docs for import/export
  • Default to only last 7 days of news, show more records per page
  • Make news items globally searchable with useful result formatting
  • Create artisan command, job and schedule for checking bulk feeds
  • Make it look better on smaller devices
  • Align image center within container
  • WIP: news item media display in curation view
  • Support different title generation methods, move fetch history timeframe to config
  • Allow for apparently ridiculously long article URLs
  • Include parent methods in configuring and booted so Filament can load Livewire
2022-04-08
  • Refactor base class structure for better reusability of last check/update functions
  • Docs
  • Docs and clarifications
2022-04-07
  • Icon update for FB items until we can get a real one in there
  • Include feed names in curation view
  • Basic FB post fetching is working
  • WIP: adding bulk feed handling
2022-04-06
  • Start adding some better documentation in README and config
  • Set up package command scheduling
  • Add console command to refresh Crowdtangle feeds
2022-04-04
  • Action to refresh/create feeds from Facebook Pages and Groups
  • Basic setup for Crowdtangle API access
2022-04-03
  • Basic news item index tests
  • Add livewire test to check Source creation through Filament
  • Ridiculous scaffolding to support authenticated users when testing Filament views
  • Load Filament and base laravel migrations so we can test Filament
  • Maybe help testbench with package discovery
  • Add example app key to phpunit for local testing
  • Include Livewire service provider so tests will pass
  • Included deleted feeds and sources for purposes of relationship fetch
  • Visually indicate that feeds are failing in the table
  • Have content curation record URLs go to the article in a new window
  • Expand length of media_url field to accommodate ridiculous image URLs
  • Add feed export command
  • Feed importing is working
  • Initial setup for feed importing


Side project: CrowdTangle API PHP library

One of the requirements for the tool I’m building is to integrate with the CrowdTangle API to pull the latest posts from Facebook groups and pages for consideration for inclusion in Bloomfield’s Daily Bulletin.

Since Facebook does not provide API access to page or group content unless you are an owner or administrator of those pages and groups, CrowdTangle is, as far as I’m aware, the only official tool available to journalists and researchers who want programmatic access to updates from a large number of Facebook groups and pages.

Here’s what the dashboard looks like on the CrowdTangle site:

CrowdTangle has good documentation for its API, but I was not able to find much in the way of tooling or libraries for interacting with it, even though it’s used widely by journalism organizations.

Since I would need to build that kind of library for this capstone project anyway, I decided to do it in a way that could benefit other journalism organizations, and created it as an open source package that is now available on GitHub:

So now, anyone building a PHP application that needs to integrate with CrowdTangle will have a head start that hopefully gets them up and running much faster.
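To give a sense of the shape of it, here’s a minimal sketch of fetching recent posts from a CrowdTangle list. The client class and method names are illustrative assumptions rather than the package’s documented API; the listIds and count parameters come from CrowdTangle’s own /posts endpoint:

    <?php
    // Hypothetical usage sketch; the client class and method names
    // are assumptions, not the package's documented API.
    use CrowdTangle\Client;

    $client = new Client($apiToken);

    // Pull the most recent posts from the pages and groups tracked
    // in a CrowdTangle list (listIds and count are real parameters
    // on CrowdTangle's /posts endpoint).
    $posts = $client->getPosts([
        'listIds' => [12345],
        'count'   => 20,
    ]);

    foreach ($posts as $post) {
        echo $post['title'] . ' | ' . $post['postUrl'] . PHP_EOL;
    }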

I let the Facebook/Meta/CrowdTangle folks know about it, and they are currently testing it out for inclusion in their documentation. I also wrote about it on my personal technology blog. And of course, I’m now making use of it in the Bloomfield web application.

Tools for building the software

Now that I’m underway on building the first version of the news harvest tool, I thought I’d say a bit about the software development process itself.

We’ve decided to build this web application using the Laravel application framework. Created in 2011, Laravel is written in PHP and used around the world for building everything from personal websites to SaaS applications to mission-critical business tools. It’s also a very stable framework with a large community of open source contributors and developers for hire, which means that finding people to add and maintain Laravel functionality is relatively easy compared to more specialized or proprietary tools. All of this makes it a good choice for the Bloomfield folks to build on in the long term.

In February, I wrote on my personal technology blog about the tools I use for Laravel-based projects, and the list remains applicable here:

To organize and track the work itself, since I’m the only developer on the project, I mostly refer to the mockups and data model, and track tasks in a simple text file broken into “DONE,” “TODO,” and “LATER” sections:
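On a given day the file might look something like this (entries here are pulled from the changelog above; the actual contents shift constantly):

    DONE
    - Feed import/export commands
    - Make news items globally searchable

    TODO
    - Dashboard stat widgets
    - Command, job and schedule for checking bulk feeds

    LATER
    - Welcome widget (needs width issues worked out)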


Hard parts of thinking through the content curation workflow

Over the last few weeks I’ve been working with Simon to talk through and refine some wireframe mockups of how the new application could work. I’ll share those separately soon, but one interesting and important conversation that came out of this process is about how the Bloomfield team reviews content that comes in through the existing news harvest process, and how that should be reflected in a new workflow and tool.

I assumed while developing the mockups that it could work well to see a content item’s original headline in an administrative “curation” view and then, to rewrite the headline for the project’s Daily Bulletin, either open a new window to view the article in its original context, or go straight to editing the headline in a pop-up modal within the tool.

Here’s one of the original mockups from March 1st showing the proposed feed curation view:

You can see the “Open” and “Edit” buttons for those two proposed ways of working with the content. Open would open a new browser window to the original article, and Edit would do something like this:
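In Filament terms (the admin panel toolkit this tool is built on), a minimal sketch of those two table actions might look like the following. The action names, the url attribute on the record, and the form field are all my assumptions here, not the final implementation:

    // Hypothetical sketch of the two proposed actions as Filament
    // table actions; names, fields and the url attribute are
    // assumptions, not the final implementation.
    use Filament\Tables\Actions\Action;
    use Filament\Forms\Components\TextInput;

    // Inside the news item resource's table definition:
    $table->actions([
        // "Open": link out to the original article in a new tab.
        Action::make('open')
            ->url(fn ($record) => $record->url)
            ->openUrlInNewTab(),

        // "Edit": rewrite the headline in a modal within the tool.
        Action::make('edit')
            ->form([
                TextInput::make('title')->required(),
            ])
            ->action(fn ($record, array $data) => $record->update($data)),
    ]);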


Consolidated or Distributed Site Structure?

One conceptual conversation Simon and I had recently was about how far to go in trying to consolidate the “news harvest” workflows and tools with the publishing and distribution workflows and tools.

At present, the workflow is distributed across many different tools and services. (A single news item that ends up in the Daily Bulletin publication might touch or be viewed in a web browser, a Zapier zap, a Slack channel, a Google Sheet, a WordPress publishing UI, a MailChimp publishing UI, a social media account and more.) While this allows for a lot of flexibility, it also creates a lot of places where the team has to think about how an item will appear, and how all of those services will connect with each other to make sure it does appear.

Building something new presents an opportunity to consolidate significantly, but we don’t want to go so far towards “one tool to rule them all” that the project can’t easily evolve or adapt to future needs. Clearly part of the project’s success has been how agile the team is and how quickly they can spin up a new tool or service to add value to what they’re doing. I imagine that this tradeoff is something newsrooms around the world have to consider on a regular basis.

To help with the conversation, I tried to visualize two main scenarios I thought we were considering and put together some diagrams.

Scenario A is a centralized administrative tool, but still with separate publishing channels:

The news harvest functionality on the far left all feeds into a central admin tool, which then publishes the curated and updated information to tools hosted and managed elsewhere; those tools in turn provide the public-facing experience that community members interact with.

A key benefit of this approach is that we don’t have to reinvent existing functionality: WordPress already has content management covered, MailChimp already has mailing lists covered, and so on. By separating the information management workflows from publication and viewing, we keep our new tool focused and lightweight, making it easy to add new features to that part of the workflow.
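As a rough illustration of that separation (the interface, class, and model names here are my own assumptions, not settled design), each publishing channel could sit behind a thin adapter so the admin tool never has to know delivery details:

    // Hypothetical sketch; interface, class and model names are
    // assumptions rather than settled design.
    interface PublishingChannel
    {
        public function publish(NewsItem $item): void;
    }

    // One thin adapter per external service keeps the admin tool
    // focused on curation rather than delivery details.
    class WordPressChannel implements PublishingChannel
    {
        public function publish(NewsItem $item): void
        {
            // e.g., send the item to the WordPress REST API
        }
    }

    class MailChimpChannel implements PublishingChannel
    {
        public function publish(NewsItem $item): void
        {
            // e.g., add the item to the next newsletter draft
        }
    }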


Side projects so far

In my initial time working with the Bloomfield Information Project, it became clear that there were some short-term, narrow-scope technical projects that could lead to immediate improvements in the BIP team’s workflows. My proposal references creating “software tools and functionality that solve some of the technical and user-facing needs of the project,” and while most of that work is still to come, I don’t want to leave out the achievements already launched or completed in the form of smaller “side” projects:

Custom RSS Feeds

So much of the daily workflow for the team revolves around scanning information sources and organizing what they find. In some cases that can be “simple” because the information comes in a standard format, such as an RSS feed already available on a news provider’s website. But in other cases, that information is hard to reach and may require several steps: logging in to an account, performing a search, filtering and scanning the results, and copying and pasting the details.

So, for those harder cases where it seemed like the information should be available in a programmatic or structured way but wasn’t, Simon and I started looking at how we could use software to make things simpler.

The result was a Laravel application I created that scrapes or otherwise converts publicly available information into useful RSS feeds. It makes use of another Laravel package I’d already created for personal projects, laravel-feedmaker, and serves the automatically updated feeds from a simple website:

We set it up to live at feeds.bloomfieldinfo.org and it currently contains feeds for a variety of sources: local and regional news sites, a public notice database, a government RFP database, government meeting videos, local news from national news websites, and more. In each case the feed links the BIP team member directly to the item on the original source, so we’re not republishing or copying any content directly.
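Under the hood, each feed comes from a small scraping job. Here’s a rough sketch of the general shape; the source URL, the link filtering, and the FeedMaker call are all illustrative assumptions, since I’m not quoting laravel-feedmaker’s actual API here:

    // Hypothetical sketch of one feed-generating job. The source
    // URL, link filtering, and FeedMaker call are illustrative.
    use Illuminate\Support\Facades\Http;

    $html = Http::get('https://notices.example.gov/search?town=Bloomfield')->body();

    $doc = new \DOMDocument();
    @$doc->loadHTML($html); // suppress warnings from messy real-world HTML

    $items = [];
    foreach ($doc->getElementsByTagName('a') as $link) {
        if (str_contains($link->getAttribute('href'), '/notice/')) {
            // Each item links straight to the original source, so
            // no content is republished or copied.
            $items[] = [
                'title' => trim($link->textContent),
                'link'  => $link->getAttribute('href'),
            ];
        }
    }

    // Hand the items to the feed builder to render the RSS XML
    // (illustrative call; the real package's API may differ).
    FeedMaker::make('public-notices')->withItems($items)->save();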

Then a team member (or even other users out on the Internet) can use their own RSS reading tools (Feedly, Zapier, or others) to process and work with those feeds just like the rest of the sources they scan. Simon tells me this evolution has already been helpful in their daily workflow.


Background Information

Here’s some background reading that helped inform my capstone proposal and that provides context on the people and projects involved:

This Twitter thread from Simon covers a lot of ground:

Some of the links mentioned in the thread:

Here’s a bit more about the Bloomfield Information Project and the existing Zapier-powered workflow that they use for their news harvest.

Here’s the It’s All Journalism podcast interview where I first heard Simon and learned about these projects.

After listening to the conversation (this capstone project wasn’t even on the radar yet), I emailed Simon to express excitement and affirmation for his work, and to share a bit about my own projects and background.

One part of that background: since 2011 I’ve been maintaining a web-based tool in my own community that aggregates and shares local news headlines and other information. I recently re-launched it at WayneCounty.info and continue to build on it today. My main goals with it are to increase community awareness of news, events and happenings in the area, to increase civic engagement based on that information, and to help people disentangle their awareness and engagement from social media platforms that incentivize problematic behaviors and the spread of misinformation.

My approach has been very focused on software automation. Simon’s has been focused on human-led curation with software tools to scale that process. There’s a lot to explore at each end and in between.

Simon and I set up a call on August 4, 2021 to chat more, and found we had a lot of shared interests and goals around local news production models. We exchanged emails and geeked out over tools and workflows in the months that followed, and at the end of 2021 I proposed collaborating as part of my capstone project. And here we are!