End-to-End Testing Flutter BLoC Apps with Patrol

I had a decent pile of bloc_test cases. Every event, every state transition, all green. And then a build went out where you literally could not log in, because a permission dialog popped up on a real device and the app just sat there behind it. My unit tests had no idea that dialog existed.

That was the day I actually sat down with Patrol.

So this is less a tutorial and more a brain-dump of what I wish someone had told me about testing a BLoC app end to end. I’ll use a counter and a dumb little to-do app for the examples so nothing here depends on any particular codebase.

So what is Patrol, quickly

It’s an integration-testing framework that sits on top of Flutter’s own integration_test. Two things make it worth the trouble.

First, the finder API is just nicer. You write await $('Login').tap() and move on with your life instead of chaining find.text and tester.tap.

Second - and this is the actual reason to use it - it can touch native UI. Permission prompts, the notification shade, the system back button, WebViews. None of that is a Flutter widget, so plain integration_test can’t see it. Patrol can. That’s the whole ballgame for me, because real apps hit permission dialogs constantly and a test that freezes on one is useless.

How “real” should the BLoC be?

This is the question I go back and forth on every time, so here’s where I landed.

Real BLoC + real repository is the most honest, but it’s slow and it flakes the second the network sneezes. Skip it for anything but a smoke test.

Mocked BLoC is the opposite - fast, fully under your control, but you’re no longer testing your BLoC at all. You’re testing “does the widget render this exact state.” Sometimes that’s exactly what you want. Usually it isn’t.

The middle option is where I live: real BLoC, fake repository. The state machine runs for real - real events, real transitions, real rebuilds - and the only thing you’ve swapped is the thing that talks to the outside world. So it’s deterministic without being fake. That’s the default. I only drop to a mock when I need a state that’s genuinely annoying to reproduce, like a specific server error.

The boring first test

Entry point is patrolTest. You get a tester, everyone calls it $, and it does both finding and tapping:

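import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';
import 'package:patrol/patrol.dart';
// Plus an import for wherever your own CounterApp lives.

void main() {
  patrolTest('increments the counter', ($) async {
    // Build the app and wait for the first frame to settle.
    await $.pumpWidgetAndSettle(const CounterApp());

    expect($('0'), findsOneWidget);

    await $(Icons.add).tap();
    await $.pumpAndSettle(); // let the BLoC emit and the UI rebuild

    expect($('1'), findsOneWidget);
  });
}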

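Nothing clever here. $('0') finds text, $(Icons.add) finds an icon, .tap() taps it. The one thing worth internalising: settling isn’t just for animations. It’s also what gives your BLoC time to emit the next state and rebuild. Patrol’s tap() already pumps and settles afterwards by default, so the explicit pumpAndSettle() here is belt and braces. It earns its keep when you’ve turned settling off or fired the event some other way; skip it then and your assertion runs against the old UI and you lose twenty minutes wondering why.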
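One caveat on settling: it only waits while frames are being scheduled. If the BLoC is sitting on a slow future, nothing is scheduling frames, so settling may return before the new state exists. Here's a rough sketch of waiting on the widget itself instead, using Patrol's waitUntilVisible(). SlowCounterCubit and SlowCounterApp are made up for this example, and the 300 ms delay is just a stand-in for a slow repository call.

```dart
import 'package:flutter/material.dart';
import 'package:flutter_bloc/flutter_bloc.dart';
import 'package:patrol/patrol.dart';

// Hypothetical cubit: emits only after a delay, like a slow repository call.
class SlowCounterCubit extends Cubit<int> {
  SlowCounterCubit() : super(0);

  Future<void> increment() async {
    await Future<void>.delayed(const Duration(milliseconds: 300));
    emit(state + 1);
  }
}

// Hypothetical app: shows the count and a "+" button that calls the cubit.
class SlowCounterApp extends StatelessWidget {
  const SlowCounterApp({super.key});

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: BlocProvider(
        create: (_) => SlowCounterCubit(),
        // Builder gives us a context that sits below the BlocProvider.
        child: Builder(
          builder: (context) => Scaffold(
            body: Center(
              child: BlocBuilder<SlowCounterCubit, int>(
                builder: (context, count) => Text('$count'),
              ),
            ),
            floatingActionButton: FloatingActionButton(
              onPressed: () => context.read<SlowCounterCubit>().increment(),
              child: const Icon(Icons.add),
            ),
          ),
        ),
      ),
    );
  }
}

void main() {
  patrolTest('waits for a delayed state', ($) async {
    await $.pumpWidgetAndSettle(const SlowCounterApp());

    await $(Icons.add).tap();

    // Keeps pumping until a visible widget with this text shows up,
    // and fails the test if Patrol's timeout runs out first.
    await $('1').waitUntilVisible();
  });
}
```

The point is that the test waits for the thing it actually cares about, the "1" on screen, instead of guessing when the app has gone quiet.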

Wiring the real BLoC with fakes

The bit that made everything click for me was: pump the same tree your app actually uses, just inject fakes at the edge. I keep a tiny helper so every test starts from the same clean slate.

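// Pumps the same provider tree the app uses; only the repository is swapped.
Future<void> pumpApp(PatrolTester $, {required TodoRepository repo}) {
  return $.pumpWidgetAndSettle(
    RepositoryProvider<TodoRepository>.value(
      value: repo,
      child: BlocProvider<TodoBloc>(
        // The real TodoBloc, reading the injected (fake) repository.
        create: (context) => TodoBloc(context.read<TodoRepository>()),
        child: const TodoApp(),
      ),
    ),
  );
}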

Then a test just hands it a fake with whatever data it needs:

patrolTest('shows todos from the repository', ($) async {
  await pumpApp($, repo: FakeTodoRepository(seed: ['Buy milk']));

  expect($('Buy milk'), findsOneWidget);
});

TodoBloc there is the real one. It runs its real logic. Only FakeTodoRepository is made up, so the test never phones home and gives you the same answer every single run.
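For reference, here's roughly what such a fake can look like. It's a minimal sketch, assuming TodoRepository has just two methods, fetchTodos and addTodo. Both names are invented for illustration, and your real interface will almost certainly differ.

```dart
// Hypothetical contract; the real TodoRepository in your app will differ.
abstract class TodoRepository {
  Future<List<String>> fetchTodos();
  Future<void> addTodo(String title);
}

/// In-memory stand-in: no network, no disk, same answer every run.
class FakeTodoRepository implements TodoRepository {
  // Copy the seed so each fake owns its own list and tests can't share state.
  FakeTodoRepository({List<String> seed = const []}) : _todos = List.of(seed);

  final List<String> _todos;

  @override
  Future<List<String>> fetchTodos() async => List.unmodifiable(_todos);

  @override
  Future<void> addTodo(String title) async {
    _todos.add(title);
  }
}

Future<void> main() async {
  final repo = FakeTodoRepository(seed: ['Buy milk']);
  await repo.addTodo('Write blog post');
  print(await repo.fetchTodos()); // [Buy milk, Write blog post]
}
```

Copying the seed in the constructor is deliberate: two tests can pass the same list literal and still never see each other's changes.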

An actual flow

And because the BLoC is real, you can push a whole journey through it and trust that every step in between was genuine, not stubbed:

patrolTest('adds a todo end to end', ($) async {
  await pumpApp($, repo: FakeTodoRepository());

  await $(TextField).enterText('Write blog post');
  await $('Add').tap();
  await $.pumpAndSettle();

  expect($('Write blog post'), findsOneWidget);
});

That AddTodo event fired for real, the BLoC emitted a new state for real, the list rebuilt for real. That’s the part bloc_test can’t give you - the wiring between the state and the pixels.
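To make "for real" a bit more concrete, here's a sketch of the kind of BLoC those steps would run through. Treat all of it as hypothetical: only the names TodoBloc, AddTodo, TodoRepository and the loading/failure constructors echo this post. The TodosRequested event, the loaded constructor, the state fields, the repository methods, and the load-on-creation behaviour are assumptions I made up to get something that compiles.

```dart
import 'package:bloc/bloc.dart';

// Hypothetical contract, assumed for this sketch only.
abstract class TodoRepository {
  Future<List<String>> fetchTodos();
  Future<void> addTodo(String title);
}

sealed class TodoEvent {}

class TodosRequested extends TodoEvent {}

class AddTodo extends TodoEvent {
  AddTodo(this.title);
  final String title;
}

class TodoState {
  const TodoState.loading() : todos = const [], error = null, isLoading = true;
  const TodoState.loaded(this.todos) : error = null, isLoading = false;
  const TodoState.failure(String message)
      : todos = const [],
        error = message,
        isLoading = false;

  final List<String> todos;
  final String? error;
  final bool isLoading;
}

class TodoBloc extends Bloc<TodoEvent, TodoState> {
  TodoBloc(this._repo) : super(const TodoState.loading()) {
    on<TodosRequested>(_onRequested);
    on<AddTodo>(_onAdd);
    add(TodosRequested()); // assumption: load as soon as the bloc exists
  }

  final TodoRepository _repo;

  Future<void> _onRequested(
    TodosRequested event,
    Emitter<TodoState> emit,
  ) async {
    try {
      emit(TodoState.loaded(await _repo.fetchTodos()));
    } catch (error) {
      emit(TodoState.failure('$error'));
    }
  }

  Future<void> _onAdd(AddTodo event, Emitter<TodoState> emit) async {
    try {
      await _repo.addTodo(event.title);
      emit(TodoState.loaded(await _repo.fetchTodos()));
    } catch (error) {
      emit(TodoState.failure('$error'));
    }
  }
}

// Tiny in-memory repository so the sketch runs without Flutter.
class MemoryTodoRepository implements TodoRepository {
  final List<String> _todos = [];

  @override
  Future<List<String>> fetchTodos() async => List.of(_todos);

  @override
  Future<void> addTodo(String title) async {
    _todos.add(title);
  }
}

Future<void> main() async {
  final bloc = TodoBloc(MemoryTodoRepository());
  await bloc.stream.first; // wait for the initial load to finish

  bloc.add(AddTodo('Write blog post'));
  final state = await bloc.stream.first; // next state after the add
  print(state.todos);

  await bloc.close();
}
```

In the Patrol test, tapping "Add" is what would end up calling bloc.add(AddTodo(...)), and the list on screen is what would rebuild from the emitted state.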

The native dialog problem (the reason I’m here)

Back to the thing that started all this. Say attaching a photo triggers a runtime permission prompt. That prompt is an OS dialog. Flutter can’t tap it. Patrol reaches out and does:

patrolTest('grants permission during upload', ($) async {
  await pumpApp($, repo: FakeTodoRepository());

  await $('Attach photo').tap();

  if (await $.native.isPermissionDialogVisible()) {
    await $.native.grantPermissionWhenInUse();
  }

  await $.pumpAndSettle();
  expect($('Photo attached'), findsOneWidget);
});

Note the if. I wrap permission handling in isPermissionDialogVisible() because these dialogs are not consistent across OS versions, and a test that assumes the dialog is there will happily fail on the one device where it isn’t. Learned that one the hard way too. Other native calls I reach for: $.native.pressBack() for the Android back button, $.native.tap(Selector(text: 'Allow')) for tapping OS UI by its text, and $.native.enterText(...) for native fields.

When I do reach for a mock

Every so often you need a state the real BLoC makes you jump through hoops to produce - a particular error banner, an empty edge case, a timeout. For those I’ll mock the BLoC and just feed it a scripted stream:

// MockBloc and whenListen come from package:bloc_test/bloc_test.dart.
class MockTodoBloc extends MockBloc<TodoEvent, TodoState>
    implements TodoBloc {}

patrolTest('shows an error banner', ($) async {
  final bloc = MockTodoBloc();
  whenListen(
    bloc,
    Stream.value(const TodoState.failure('Something went wrong')),
    initialState: const TodoState.loading(),
  );

  await $.pumpWidgetAndSettle(
    BlocProvider.value(value: bloc, child: const TodoApp()),
  );

  expect($('Something went wrong'), findsOneWidget);
});

But I keep these to a minimum. This test proves the widget reacts correctly to a state - it says nothing about whether the BLoC would ever produce that state. Lean on it too much and you’ve got a suite full of green that’s testing your mocks.

Stuff that bit me, so it doesn’t bite you

Give every test its own fresh BLoC and fresh fake repo - state leaking between tests is a special kind of hell to debug. Find things by text, key, or icon, not by poking at the widget tree, because the tree moves and text usually doesn’t. And after anything that fires an event, pump and settle before you assert. Ninety percent of my “why is this flaky” moments came down to one of those three.

That’s basically it

bloc_test and Patrol aren’t rivals, they’re just two different heights. One proves each BLoC behaves on its own; the other proves the app - real BLoCs, native dialogs, the works - survives a human poking at it. Real BLoC, fake repository, Patrol for the native bits, mocks only when you’re cornered. For me, that combo has caught the stuff that actually reaches users, including that login dialog I never want to ship again.