Dive into a new way of experiencing Hacker News. By combining Bytewax, Proton, and Grafana, I've developed a customizable dashboard that delivers the stories and comments that matter most to you, in real time. Break free from generic feeds and explore personalized insights that keep you up to date with the tech community.
Analyzing Hacker News in Real-Time with Bytewax and Proton
If you're tired of the same old media algorithms that dictate what you see on platforms like Twitter or Reddit, this project offers a refreshing alternative. Leveraging the Hacker News API, I developed a customizable dashboard to curate personal Hacker News stories in real time, utilizing technologies like Bytewax and Proton.
Discover the Overview
In this repository, you will find a detailed exploration of how to analyze prolific Hacker News commenters and trending stories dynamically. Together with insights from Jove from Timeplus, I combined various innovative tools for this endeavor:
- Bytewax: An open-source project designed for easy, custom connections to multiple data sources, which excels at processing streaming data in familiar Python.
- Proton: A powerful streaming SQL engine powered by ClickHouse, which enables fast processing of streaming and historical data, promising sub-millisecond latencies and impressive performance metrics.
Create Your Customized Dashboard
Below, you'll find an example of the personalized dashboard powered by a Bytewax pipeline that streams Hacker News stories and comments into Proton:
This dashboard can serve your unique Hacker News feed, enabling you to bypass the generic recommendations and focus on what truly matters to you.
Dive Into the Coding Details
Here's a glimpse of how the code is structured to create a custom input connector designed to poll the Hacker News API:
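    import requests
    from bytewax.inputs import SimplePollingSource  # import path as of Bytewax 0.18+

    class HNSource(SimplePollingSource):
        def next_item(self):
            # Each poll emits the current maximum item ID under a constant key.
            return (
                "GLOBAL_ID",
                requests.get("https://hacker-news.firebaseio.com/v0/maxitem.json").json(),
            )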
The flexibility of Bytewax allows for tailored data ingestion based on application needs, setting the stage for efficient real-time processing.
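Because the connector emits only the current maximum item ID on each poll, a later step has to work out which IDs are new since the previous poll. Here is a minimal sketch of that bookkeeping in plain Python. `new_ids_since` is a hypothetical helper, not part of Bytewax or this repository, and in a real dataflow the previous maximum would be kept in a stateful step rather than a local variable.

```python
from typing import List, Optional, Tuple


def new_ids_since(
    last_max: Optional[int], current_max: int
) -> Tuple[int, List[int]]:
    """Return the updated high-water mark and the IDs not seen before.

    Hypothetical helper: `last_max` is the maximum ID seen on the previous
    poll, or None on the very first poll.
    """
    if last_max is None:
        # First poll: start from the current item and ignore the backlog.
        return current_max, [current_max]
    if current_max <= last_max:
        # Nothing new since the last poll.
        return last_max, []
    # Every ID strictly above the old maximum is new.
    return current_max, list(range(last_max + 1, current_max + 1))


# Simulate three polls; the second one sees no new items.
state = None
for polled in (100, 100, 103):
    state, fresh = new_ids_since(state, polled)
    print(polled, fresh)
```

Starting from the current item on the first poll is a design choice in this sketch; replaying the full history would mean fetching every item ever posted.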
Writing Efficient Dataflows
Once you have the custom connector in place, you can construct an efficient dataflow to process incoming items from Hacker News:
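    import logging
    from typing import Optional, Tuple
    import requests

    logger = logging.getLogger(__name__)

    def download_metadata(hn_id) -> Optional[Tuple[str, dict]]:
        data = requests.get(
            f"https://hacker-news.firebaseio.com/v0/item/{hn_id}.json"
        ).json()
        # The API returns null (None here) when no item is available for this ID.
        if data is None:
            logger.warning(f"Couldn't fetch item {hn_id}, skipping")
            return None
        return (str(hn_id), data)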
This example shows how to download the metadata for a single Hacker News item. When the API returns null for an ID, the function logs a warning and returns None, so the item can be skipped and the process keeps flowing smoothly.
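A network failure or a malformed response would still raise inside `download_metadata`. One way to cover those cases as well is sketched below, using only the standard library. `fetch_item`, `download_metadata_safe`, and the injectable `fetch` parameter are hypothetical names introduced for illustration; they are not code from this repository.

```python
import json
import logging
import urllib.request
from typing import Callable, Optional, Tuple

logger = logging.getLogger(__name__)

ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{}.json"


def fetch_item(hn_id: int, timeout: float = 10.0) -> Optional[dict]:
    """Fetch one item as parsed JSON (None when the API returns null)."""
    with urllib.request.urlopen(ITEM_URL.format(hn_id), timeout=timeout) as resp:
        return json.load(resp)


def download_metadata_safe(
    hn_id: int, fetch: Callable[[int], Optional[dict]] = fetch_item
) -> Optional[Tuple[str, dict]]:
    """Return (id, metadata), or None if the item is missing or the fetch fails."""
    try:
        data = fetch(hn_id)
    except (OSError, ValueError) as exc:
        # OSError covers connection errors and timeouts (urllib.error.URLError
        # subclasses it); ValueError covers malformed JSON.
        logger.warning("Couldn't fetch item %s (%s), skipping", hn_id, exc)
        return None
    if data is None:
        logger.warning("Item %s is not available, skipping", hn_id)
        return None
    return (str(hn_id), data)
```

Passing the fetcher as a parameter keeps the error handling testable offline: a stub such as `lambda i: None` stands in for the API without any network access.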
Powerful Real-Time Analysis with Grafana
The integration of Proton lets you easily turn the incoming data into a rich analytic experience with Grafana. With pre-built queries and materialized views, you can effortlessly visualize the data:
    SELECT * FROM hn_stories WHERE _tp_time > earliest_ts();
The dashboard created through this process lets you monitor and interact with Hacker News like never before!
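In this project the aggregation itself runs as streaming SQL inside Proton. Purely to illustrate the kind of result a "prolific commenters" panel shows, here is a batch sketch in plain Python. The `type` and `by` fields follow the Hacker News API item format, while `top_commenters` and the sample items are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_commenters(items: Iterable[dict], n: int = 3) -> List[Tuple[str, int]]:
    """Count comments per author and return the n most active authors."""
    counts = Counter(
        item["by"]
        for item in items
        # Keep comments only; deleted items may lack a "by" field, so skip them.
        if item.get("type") == "comment" and "by" in item
    )
    return counts.most_common(n)


# Made-up sample data, shaped like Hacker News API items.
sample = [
    {"id": 1, "type": "story", "by": "alice"},
    {"id": 2, "type": "comment", "by": "bob"},
    {"id": 3, "type": "comment", "by": "carol"},
    {"id": 4, "type": "comment", "by": "bob"},
    {"id": 5, "type": "comment", "deleted": True},
]
print(top_commenters(sample))
```

A streaming engine keeps such counts updated continuously as new comments arrive, whereas this sketch recomputes them from a fixed list.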
Get Involved
Explore this repository to learn more about how Bytewax and Proton work together to provide a real-time analytic solution for Hacker News. If you find this project helpful, feel free to share your own implementations on platforms like Reddit or Hacker News, and don't hesitate to support us by starring our GitHub repositories!
⭐ Bytewax
⭐ [Proton](https://github.