Overcast Statistics

Podcasts are huge at the moment and they are pretty much all I listen to these days when I am out walking or in the car. I listen using the (mostly) excellent Overcast app. I say mostly because it has a UI that I struggle to find things in. Anywho, the one thing it lacks (other than an intuitive UI) is decent stats, and I really wanted those for my annual Review of the Year posts over on my personal blog.

After a quick DuckDuckGo I found a post from James Hodgkinson with a Python script that spat out some simple stats. It was a good starting point, so I converted it to PHP and extended it to give me more of what I needed. The original script just gave totals across all podcasts and I wanted a breakdown by podcast.

What I ended up with was the following. If you want to use it, remember to change the file location and date range.

<?php
/**
 * Parses the "All data" OPML file from https://overcast.fm/account
 * and shows some simple stats.
 */

$filename = '<location to your>/overcast.opml';
$startDate = new DateTime("2023-01-01");
$endDate = new DateTime("2023-12-21");

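$XMLDATA = simplexml_load_file($filename);
if ($XMLDATA === false) {
    // report on STDERR so the message does not end up in the redirected HTML
    fwrite(STDERR, "Could not read or parse $filename" . PHP_EOL);
    exit(1);
}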

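// running totals across all podcasts (not included in the output below)
$played_episodes = 0; // played episodes within the date range
$episodes = 0;        // unplayed episodes within the date range
$podcasts = 0;        // number of podcast feeds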

$feeds = [];

foreach ($XMLDATA->body->children() as $child) {
    if ((string)$child['text'] == 'feeds') {
        $feeds[] = $child;
    }
}

$results = [];
$i = 0;

foreach ($feeds as $obj) {
    foreach ($obj->children() as $playlist) {
        $podcasts++;
        $attributes = $playlist->attributes();
        $results[$i]['title'] = (string) $attributes['title'];
        $results[$i]['url'] = (string) $attributes['htmlUrl'];
        $results[$i]['count'] = 0;
        foreach ($playlist->children() as $episode) {

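            // check date is in the range we want to check
            $attributes = $episode->attributes();
            $dateString = (string) $attributes['userUpdatedDate'];
            if ($dateString === '') {
                // no date recorded: skip it, otherwise DateTime would default to "now"
                continue;
            }
            $dateTime = new DateTime($dateString);

            if ($dateTime >= $startDate && $dateTime <= $endDate) {
                if ((string)$episode['played'] != "1") {
                    // in range but not played
                    $episodes++;
                } else {
                    // in range and played: count it against this podcast
                    $results[$i]['count']++;
                    $played_episodes++;
                }
            }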

        }
        $i++;
    }
}

// sort the results, most-played podcast first
usort($results, function ($a, $b) {
    return $b['count'] <=> $a['count'];
});

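// Output results as an HTML table, or as plain text if $asHtml is set to false
$asHtml = true;

if ($asHtml) {
    echo '<table border="1">';
    echo '<tr><th>Title</th><th>Count</th></tr>';
    foreach ($results as $result) {
        // only list podcasts with more than one played episode
        if ($result['count'] > 1) {
            // escape titles and URLs so characters like & don't break the HTML
            $title = htmlspecialchars($result['title']);
            echo '<tr>';
            if ($result['url'] != '') {
                echo '<td><a href="' . htmlspecialchars($result['url']) . '">' . $title . '</a></td>';
            } else {
                echo '<td>' . $title . '</td>';
            }
            echo '<td align="right">' . $result['count'] . '</td>';
            echo '</tr>';
        }
    }
    echo '</table>';
} else {
    // plain-text alternative: one "title count" line per podcast with at least one play
    foreach ($results as $result) {
        if ($result['count'] > 0) {
            echo $result['title'] . ' ' . $result['count'] . PHP_EOL;
        }
    }
}

?>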

The script spits out an HTML table on standard output, so to run it you’ll need to redirect that into a file, something like the following, changing the output location to suit.

php overcast.php > ~/Downloads/overcast.html

Open the resulting HTML file in a browser and you’ll get stats that look a little like the following:

As you can see I like a good Goalhanger podcast! Anyway, I hope you find it useful and if you extend it at all let me know.
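
If you do want to extend it, it can help to try the counting logic on a tiny file before pointing it at your real export. Here is a minimal sketch of that. The sample OPML is made up and only includes the attributes the script above reads (text, title, htmlUrl, played and userUpdatedDate), the date format is an assumption, and playedPerPodcast() is just a name I've picked for illustration. A real Overcast export will have a lot more in it.

```php
<?php
// Made-up sample data: the nesting and attribute names mirror what the main
// script reads, but the shows, dates and date format are assumptions.
$sample = <<<'XML'
<opml version="1.0">
  <body>
    <outline text="feeds">
      <outline title="Example Show A" htmlUrl="https://example.com/a">
        <outline title="Ep 1" played="1" userUpdatedDate="2023-02-01T10:00:00+00:00"/>
        <outline title="Ep 2" played="1" userUpdatedDate="2023-06-15T10:00:00+00:00"/>
        <outline title="Ep 3" userUpdatedDate="2023-07-01T10:00:00+00:00"/>
      </outline>
      <outline title="Example Show B" htmlUrl="">
        <outline title="Ep 1" played="1" userUpdatedDate="2022-11-30T10:00:00+00:00"/>
        <outline title="Ep 2" played="1" userUpdatedDate="2023-01-05T10:00:00+00:00"/>
      </outline>
    </outline>
  </body>
</opml>
XML;

/**
 * Hypothetical helper: returns [podcast title => played episodes in range],
 * sorted with the most-played podcast first.
 */
function playedPerPodcast(SimpleXMLElement $opml, DateTimeInterface $start, DateTimeInterface $end): array
{
    $counts = [];
    foreach ($opml->body->children() as $section) {
        if ((string) $section['text'] !== 'feeds') {
            continue; // only the "feeds" section holds podcasts
        }
        foreach ($section->children() as $podcast) {
            $title = (string) $podcast['title'];
            $counts[$title] = 0;
            foreach ($podcast->children() as $episode) {
                $date = (string) $episode['userUpdatedDate'];
                if ($date === '' || (string) $episode['played'] !== '1') {
                    continue; // skip undated or unplayed episodes
                }
                $when = new DateTimeImmutable($date);
                if ($when >= $start && $when <= $end) {
                    $counts[$title]++;
                }
            }
        }
    }
    arsort($counts); // highest count first, keeping titles as keys
    return $counts;
}

$utc = new DateTimeZone('UTC');
$counts = playedPerPodcast(
    simplexml_load_string($sample),
    new DateTimeImmutable('2023-01-01', $utc),
    new DateTimeImmutable('2023-12-31 23:59:59', $utc)
);

foreach ($counts as $title => $count) {
    echo $title . ' ' . $count . PHP_EOL;
}
```

Note the end date here includes a time. A bare date such as 2023-12-31 means midnight at the start of that day, so anything played later on the last day of the range would otherwise be left out.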
