
I Built an AI That Reads the News Every Morning. The AI Was the Easy Part.

What it actually takes to make an automated intelligence system a company trusts — the prompt design, the failure modes, and the unglamorous reliability work nobody posts about.

Search "AI news digest" and you'll find a hundred versions of the same tutorial: connect GPT to an RSS feed, ask it to summarize, dump the result somewhere. Twenty minutes, very satisfying, looks like magic in a demo.

I built one of these for a client — and I want to be honest up front: the pattern is not novel. Pulling articles, sending them to a model, writing the output to a sheet is one of the most-built automations on the internet.

But there's a gap between that demo and a system a business actually relies on every morning, and almost nobody writes about it, because it's the unglamorous part. Connecting GPT to a feed is the easy 80%. The 20% that makes the output trustworthy enough to act on is the entire job. That's what this is about.

What it does — the 60-second version

My client used to spend the first hours of every day manually scanning industry publications across dozens of sources, trying to catch anything their competitors or their niche were doing that mattered. It was slow, and worse, it was lossy — a human skimming dozens of articles at 8 a.m. will miss things, and there's no way to know what they missed.

The system replaces that manual scan. Every morning it reads the day's articles across every source on a list, answers a fixed set of questions the company actually cares about, and writes the answers to a Google Sheet. On a quiet day it emails to say nothing relevant came up. The stack is deliberately boring: n8n for orchestration, GPT for the reading, Google Sheets for input and output, Gmail for alerts.

That's the part you've seen before. Here's the part you haven't.

The model is a component. The reliability is the product.

Hard problem 01: A model that lies politely

The default failure mode of these systems isn't that they break. It's that they're confidently wrong, and you don't notice for weeks.

A language model is built to be helpful, which means when you ask it "did any competitor change their pricing?" and the answer is genuinely no, it will still reach for something plausible. It will infer, extrapolate, and hand you a clean, well-written answer that is quietly fabricated. A digest full of those is worse than no digest — because the client acts on it.

The fix is counterintuitive: you have to make the model comfortable saying nothing. The prompt explicitly instructs it to answer only when there's real evidence in the source text, and to return a hard NOT FOUND otherwise. That one constraint is the difference between a tool the client trusts and one they quietly stop opening.

Early versions did exactly the wrong thing — they'd take a speculative blog opinion and report it as a settled fact. That class of bug is what forced the evidence requirement: no quote from the source, no claim in the digest.
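The "no quote, no claim" rule can also be checked outside the prompt. The sketch below is an illustration, not the production workflow (which runs in n8n): it assumes the model returns an answer plus a supporting quote, and it downgrades the answer to NOT FOUND unless that quote actually appears in the article text. The function name, the 20-character minimum, and the normalization rules are all assumptions made for the example.

```python
import re

NOT_FOUND = "NOT FOUND"


def _normalize(text: str) -> str:
    """Straighten curly quotes, collapse whitespace, and lowercase,
    so trivial formatting differences do not fail the match."""
    for curly, plain in (("\u2018", "'"), ("\u2019", "'"),
                         ("\u201c", '"'), ("\u201d", '"')):
        text = text.replace(curly, plain)
    return re.sub(r"\s+", " ", text).strip().lower()


def enforce_evidence(answer: str, quote: str, source_text: str,
                     min_quote_chars: int = 20) -> str:
    """Return the answer only if its quote appears in the source text.

    Hypothetical helper: anything without verifiable evidence is
    replaced by NOT_FOUND instead of reaching the digest.
    """
    if answer.strip().upper() == NOT_FOUND:
        return NOT_FOUND
    normalized_quote = _normalize(quote)
    # A very short quote ("Pro plan") proves nothing, so reject it.
    if len(normalized_quote) < min_quote_chars:
        return NOT_FOUND
    if normalized_quote not in _normalize(source_text):
        return NOT_FOUND
    return answer.strip()
```

A substring match like this only proves the quote exists in the article, not that it supports the claim. It catches invented quotes, which is the cheap half of the problem; judging whether a real quote is opinion or fact still depends on the prompt.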

Hard problem 02: "Summarize this" is the most expensive prompt in automation

The instinct is to ask the model to summarize the news. It's also the wrong instinct, and it's expensive in two ways.

It's expensive literally — summaries are long, so you pay for tokens to generate paragraphs nobody reads carefully. And it's expensive in attention — a generic summary buries the one sentence that matters under ten that don't, which means a human still has to read it. You've automated the typing and kept the reading.

So the system doesn't summarize. It interrogates. The prompt carries a fixed set of pointed questions — the specific things this business needs to know about its competitors and niche — plus one catch-all for anything else genuinely notable. The model isn't asked "what happened today?" It's asked "did this specific thing happen, yes or no, and if yes, where?" The output is short, scannable, and immediately actionable.

The shape of the question set (illustrative)

— Did any named competitor change pricing, packaging, or launch a product? Quote the line.

— Any regulation, lawsuit, or funding round touching our niche? Quote the line.

— Anything else a founder in this space would want to know today? Else: NOT FOUND.
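To make that shape concrete, here is a minimal sketch of how a question list could be turned into a single interrogation prompt. The wording of the instructions, the `build_prompt` name, and the `---` delimiters are assumptions for illustration; the real prompt is not reproduced in this post.

```python
NOT_FOUND = "NOT FOUND"


def build_prompt(questions: list[str], article_text: str) -> str:
    """Assemble one prompt that asks pointed questions about one article.

    Hypothetical helper: the instruction text is illustrative only.
    """
    numbered = "\n".join(
        f"{i}. {question}" for i, question in enumerate(questions, start=1)
    )
    return (
        "You are reviewing one news article for a business.\n"
        "Answer each numbered question using ONLY the article text below.\n"
        "For every answer, quote the exact line that supports it.\n"
        f"If the article has no direct evidence, answer exactly: {NOT_FOUND}\n"
        "Do not infer, speculate, or use outside knowledge.\n\n"
        f"Questions:\n{numbered}\n\n"
        f"Article:\n---\n{article_text}\n---\n"
    )


if __name__ == "__main__":
    demo = build_prompt(
        ["Did any named competitor change pricing? Quote the line.",
         "Anything else a founder in this space would want to know today?"],
        "Example article text goes here.",
    )
    print(demo)
```

Numbering the questions keeps the answers easy to map back to rows in a sheet, and repeating the NOT FOUND rule inside the prompt keeps the "say nothing" behavior next to the questions it governs.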

Hard problem 03: The web is hostile, and you have to plan for it

Demos run on clean inputs. Production runs on the actual web, which is a mess.

  • RSS quality is wildly inconsistent. Plenty of publications have no usable feed, so the system falls back to fetching and scraping the page. You have to assume this from day one, not bolt it on later.
  • Articles repeat. Without a recent-only filter and de-duplication, the same story gets processed twice and the model wastes tokens re-answering yesterday's news.
  • Silence is ambiguous. If the system finds nothing and says nothing, the client can't tell "no news" from "it broke." So a quiet day triggers an explicit "nothing relevant today" email. Turning silence into a signal is a tiny feature that buys enormous trust.
  • News days vary. A heavy day can overflow the model's context window or quietly balloon the cost, so the volume going in has to be budgeted, not assumed.
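Three of those points (recent-only, de-duplication, and a volume budget) plus the quiet-day check fit in a few lines of pre- and post-processing. The sketch below is one possible shape under stated assumptions: the `Article` fields, the 24-hour window, the title-based fingerprint, and a character budget standing in for a token budget are all hypothetical choices, not details of the real workflow.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

NOT_FOUND = "NOT FOUND"


@dataclass
class Article:
    url: str
    title: str
    text: str
    published: datetime  # assumed to be timezone-aware


def fingerprint(article: Article) -> str:
    """Hash the lowercased, whitespace-collapsed title.

    This only catches repeats with the same headline; a story rewritten
    under a different headline will slip through.
    """
    key = " ".join(article.title.lower().split())
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def select_articles(articles, now, max_age_hours=24,
                    char_budget=60_000, seen=None):
    """Keep recent, unseen articles until the character budget is used.

    `seen` can carry fingerprints over from earlier runs. Articles that
    would overflow the budget are skipped, not truncated.
    """
    seen = set() if seen is None else seen
    cutoff = now - timedelta(hours=max_age_hours)
    selected, used = [], 0
    # Newest first, so a heavy news day drops the oldest items.
    for article in sorted(articles, key=lambda a: a.published, reverse=True):
        if article.published < cutoff:
            continue
        fp = fingerprint(article)
        if fp in seen:
            continue
        if used + len(article.text) > char_budget:
            continue
        seen.add(fp)
        selected.append(article)
        used += len(article.text)
    return selected


def is_quiet_day(answers) -> bool:
    """True when every answer is NOT FOUND (or there are no answers),
    which is the cue to send the 'nothing relevant today' email."""
    return all(a.strip().upper() == NOT_FOUND for a in answers)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    demo = [Article("https://example.com/a", "Example story", "body text", now)]
    print([a.title for a in select_articles(demo, now)])
```

Characters are only a rough proxy for tokens. A real budget would use the tokenizer that matches the model, but the principle is the same: decide the ceiling before the heavy news day arrives.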

None of this is glamorous. All of it is the reason the system still runs every morning instead of breaking the first week — for cents a day, in place of the chunk of every morning that used to disappear into manual scanning.

The decision I'm most sure about — don't own the questions

The most important design choice wasn't technical. The set of questions the system asks lives in the client's spreadsheet, not in my workflow — so the client can change what gets watched without touching code or asking me.

This goes against the engineer's instinct to keep control of the logic. But the people closest to the business know what's worth monitoring far better than I do, and a system they can steer themselves is a system that stays useful as their priorities shift. The intelligence isn't just in the prompt; it's in who gets to edit it.
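In code terms, "don't own the questions" means the question list is loaded at run time rather than hardcoded. The sketch below assumes the sheet is available as CSV text with hypothetical `question` and `active` columns; the real system reads Google Sheets through n8n, and its column layout is not described here.

```python
import csv
import io


def load_questions(csv_text: str) -> list[str]:
    """Read the client's question list from a CSV export of their sheet.

    Assumed columns (hypothetical): 'question' and 'active'.
    Blank questions and rows marked inactive are skipped, so the client
    can pause a question without deleting it.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    questions = []
    for row in reader:
        question = (row.get("question") or "").strip()
        # A missing 'active' cell defaults to enabled.
        active = (row.get("active") or "yes").strip().lower()
        if question and active in {"yes", "true", "1"}:
            questions.append(question)
    return questions


if __name__ == "__main__":
    sheet = (
        "question,active\n"
        "Did any named competitor change pricing?,yes\n"
        "Any lawsuit touching our niche?,no\n"
    )
    print(load_questions(sheet))
```

Because the list is read fresh on every run, an edit to the sheet in the afternoon changes what gets asked the next morning, with no deployment in between.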

When this approach doesn't work

The honest version of any build includes its limits. This pattern works well when the questions are stable and the signal is textual. It's a bad fit when you need real-time alerting (this is a daily batch, by design), when the sources are mostly paywalled or behind heavy JavaScript, or when the "questions" change constantly. At that point you're not monitoring, you're researching, and that's a different system. Knowing which problem you actually have is most of the work.

The real takeaway

This was never really a story about news monitoring. It's about the gap between an AI demo and an AI system — and that gap is almost entirely made of unglamorous, judgment-heavy decisions: making a model admit uncertainty, asking the right questions instead of generic ones, and engineering for a world that doesn't behave. The model is a component. The reliability is the product.

If there's a process someone in your business does manually every morning — reading, checking, watching — there's a good chance it can run itself. The hard part won't be the AI.

I'm Roshaan Ejaz, an AI Systems Architect. I design, build, and run production-grade AI systems inside companies — not fragile no-code demos. If you've got a daily manual process worth automating, find me at roshaanejaz.com.
