Integration testing in Flutter

Team Glean's Cameron gives us his take on a new approach to integration testing with Flutter


Published: 19 Aug 2021

Cameron McLoughlin

Integration tests are an important part of any software testing strategy. They provide stronger guarantees of correctness than unit tests, but at the cost of being harder to write and slower to run. However, using Flutter and the new integration_test package lets us close the gap somewhat, while still giving us the same peace of mind that we expect from integration tests.

Migrating from 'flutter_driver'

When Flutter launched, integration tests were written using flutter_driver, which allowed programmatic control of a Flutter app. When run with the flutter drive command, Flutter would spawn 2 processes:

A "driver" process on the host machine (i.e. the developer's laptop) that would send instructions to and receive data from the app
The app itself, configured to listen to incoming connections from the driver process

This allowed the developer to write tests in plain Dart that could interact with the app. However, it had some major drawbacks:

The driver code couldn't share any code with the app, or unit tests. In fact, because of the way it was compiled, it couldn't use any package:flutter dependencies
It relied on "stringly-typed" APIs, because it couldn't import Key, Widget or MyCustomWidget. find.byType('EventsPage') is easy to mistype, and even easier to misread
Any communication happened over the RPC channel between the 2 processes. If you wanted to read some internal state of the app, you would need to serialise a message and register a special handler in your app.

Enter 'integration_test'

The integration_test package was released to fix some of these issues. The main difference is that the test code runs in the same isolate as the app code itself (meaning it has access to the same memory). This essentially solves the issues listed above, and brings a few other nice benefits:

Same API as widget tests (flutter_test)
Able to share code with the app
Internal app state is totally visible to tests, since runApp is called inside the test
Since the tests are built into the app executable, they are now compatible with Firebase Test Lab, for running tests on physical devices
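
As a rough illustration of these points, here is a minimal sketch of an integration_test file. The EventsPage widget defined here is a hypothetical stand-in for a real app page, and the file path in the comment is an assumption; the binding call, testWidgets, and the finders are the standard integration_test and flutter_test APIs.

```dart
// integration_test/app_test.dart (assumed location)
import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';
import 'package:integration_test/integration_test.dart';

// Hypothetical stand-in for a real page widget from the app.
class EventsPage extends StatelessWidget {
  const EventsPage({Key? key}) : super(key: key);

  @override
  Widget build(BuildContext context) =>
      const Scaffold(body: Center(child: Text('Events')));
}

void main() {
  // Installs the binding that lets the test run on a device or emulator.
  IntegrationTestWidgetsFlutterBinding.ensureInitialized();

  testWidgets('shows the events page on launch', (tester) async {
    // The app is started inside the test, in the same isolate as the test code.
    runApp(const MaterialApp(home: EventsPage()));
    await tester.pumpAndSettle();

    // The widget type is referenced directly, so no string lookups are needed.
    expect(find.byType(EventsPage), findsOneWidget);
  });
}
```

Because the test body uses the same WidgetTester as a widget test, helpers written for one kind of test can in principle be reused in the other.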

Before diving into the changes, a word on page objects.

Page objects are a simple abstraction that makes it easier to read and write widget tests. Flutter tests give you very fine-grained control over the lifecycle of the app, but sometimes this is overkill, and we just want a "sane default behaviour". For example:

// without page objects
expect(find.byType(EventsPage), findsOneWidget);  // check we're on the "Events Page"
await tester.tap(find.byKey(EventsPageKeys.newEventButton));  // tap the new event button
await tester.pumpAndSettle();  // request frames to be scheduled until there are none left
expect(find.byType(StartEventPage), findsOneWidget);  // check we're on the "Start Event Page"

// with page objects
final startEventPage = await eventsPage.newEvent();
final recordEventPage = await startEventPage.startRecording(eventName: 'My New Event');

However, since these page objects depend on package:flutter, we couldn't use them in driver tests, so for a while we maintained 2 sets of page objects. Not only that, but the underlying APIs (flutter_test and flutter_driver) had quite a few subtle differences in behaviour that were unintuitive and error-prone.

Page objects vs. a real network

With integration_test, we could use these page objects in our integration tests as well, but some modifications had to be made. It's common to call tester.pumpAndSettle() in widget tests to "finish" your last input (e.g. a tap, a scroll, etc.). It tells the test environment to:

build and render a new frame
check to see if new frames are scheduled
repeat until there are no more frames scheduled
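
The loop above can be sketched in plain Dart. This is a toy model, not Flutter's implementation: FakeScheduler and this free-standing pumpAndSettle function are hypothetical stand-ins, and the frame limit takes the place of the real method's time-based timeout.

```dart
/// Toy stand-in for the framework's frame scheduler (not a Flutter API).
class FakeScheduler {
  FakeScheduler(this.framesRemaining);

  /// Frames that an in-flight animation still wants to draw.
  int framesRemaining;

  bool get hasScheduledFrame => framesRemaining > 0;

  /// Renders one frame; an animation in progress then schedules the next one.
  void pump() {
    if (framesRemaining > 0) framesRemaining--;
  }
}

/// Pumps at least one frame, then keeps pumping while frames are scheduled.
/// Returns the number of frames pumped.
int pumpAndSettle(FakeScheduler scheduler, {int maxFrames = 1000}) {
  var count = 0;
  do {
    if (count >= maxFrames) throw StateError('pumpAndSettle timed out');
    scheduler.pump();
    count++;
  } while (scheduler.hasScheduledFrame);
  return count;
}

void main() {
  // A button animation that needs 12 more frames.
  print(pumpAndSettle(FakeScheduler(12))); // 12
  // Nothing scheduled: one frame is still pumped.
  print(pumpAndSettle(FakeScheduler(0))); // 1
}
```

Note that the loop only knows about scheduled frames. Pending futures, such as an in-flight network request, play no part in deciding when it stops.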

This works really well in widget tests, since network calls are typically mocked and can resolve synchronously (or at the very least before the next frame). It also covers animations: buttons commonly have subtle animations that take 10-15 frames, which a single tester.pump() wouldn't catch.

In integration tests, however, network requests are real. If the animation finishes before the network request does, the app reaches step 3 (no more frames scheduled) and pumpAndSettle() returns while the app is still waiting for the response, so your test will likely fail. To get around this, our page objects needed a little tweaking. Before migrating, when a page object "tapped" something it would call this:

class PageObject<T extends Widget> {
  // fields and other boilerplate

  Future<void> tap(dynamic finder) async {
    await tester.tap(wrapFinder(finder));
    await tester.pumpAndSettle();
  }
}

Here, wrapFinder is a utility method that improves ergonomics slightly. It just lets us write page.tap('Event name') instead of page.tap(find.text('Event name')).
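
A helper like this could look roughly as follows. This is only a sketch: the String case is the one described above, while the Key and Type branches and the error handling are assumptions about what such a helper might reasonably accept.

```dart
import 'package:flutter/widgets.dart';
import 'package:flutter_test/flutter_test.dart';

/// Hypothetical sketch: converts a loosely typed argument into a Finder.
Finder wrapFinder(dynamic finder) {
  if (finder is Finder) return finder; // already a Finder, pass it through
  if (finder is String) return find.text(finder); // 'Event name'
  if (finder is Key) return find.byKey(finder); // assumed convenience
  if (finder is Type) return find.byType(finder); // assumed convenience
  throw ArgumentError.value(finder, 'finder', 'unsupported finder type');
}
```

The trade-off is that the dynamic parameter moves some checking from compile time to run time, which is why the sketch fails loudly on anything it doesn't recognise.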

This naive approach was fine, as long as any network calls that needed to happen finished within the call to pumpAndSettle. When this wasn't the case, we needed to wait for something to be visible:

Future<void> tap(dynamic finder, {dynamic waitFor}) async {
  await tester.tap(wrapFinder(finder));
  await tester.pumpAndSettle();

  if (waitFor != null && isIntegrationTest) {
    // add some delay, then check if waitFor is visible, with some exponential backoff
  }
}
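
The commented-out step could be filled in along these lines. This is a plain-Dart sketch under stated assumptions, not our actual implementation: waitUntil, its parameters and its default values are hypothetical. In a real test, the isVisible callback would be something like tester.any(wrapFinder(waitFor)), and the delay would more likely be a tester.pump(delay) so that frames keep rendering while waiting.

```dart
import 'dart:async';

/// Waits, then checks [isVisible], doubling the delay after every failed check.
/// Returns the number of checks made, or throws a [TimeoutException] once
/// [maxAttempts] checks have failed.
Future<int> waitUntil(
  bool Function() isVisible, {
  Duration initialDelay = const Duration(milliseconds: 50),
  int maxAttempts = 6,
}) async {
  var delay = initialDelay;
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    await Future<void>.delayed(delay);
    if (isVisible()) return attempt;
    delay *= 2; // with the defaults: 50 ms, 100 ms, 200 ms, ...
  }
  throw TimeoutException('not visible after $maxAttempts checks');
}

Future<void> main() async {
  // Stand-in for "the widget has appeared": true from the third check onwards.
  var checks = 0;
  bool appeared() => ++checks >= 3;

  final attempts = await waitUntil(
    appeared,
    initialDelay: const Duration(milliseconds: 10),
  );
  print('visible after $attempts checks'); // visible after 3 checks
}
```

Capping the number of attempts keeps a missing widget from hanging the test run; the test fails with a timeout instead.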

Then when we define our page-specific methods, we can pass in a value to waitFor when we know that we want to wait for something. Often, however, we follow a fairly standard pattern of: button press -> wait for network request -> go to new page. For this default case, we can do better! There is a variant of tap called tapAndNavigate, which just performs a tap, then returns a page object for a different page:

Future<PageObject<S>> tapAndNavigate<S extends Widget>(dynamic finder) async {
  await tap(finder, waitFor: find.byType(S));
  return PageObject(tester);  // the type of the new page is inferred here
}

Since we know that, when changing to a new page, we usually want to wait for it to appear (and since Dart supports reified generics), we can simply pass find.byType(S) as waitFor to make sure we wait until the new page has appeared.
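
To show what reified generics buy us here, the following plain-Dart sketch uses simplified, hypothetical stand-ins (PageObject without a tester, navigateTo in place of tapAndNavigate, and page classes that don't extend Widget). The point is only that a type parameter is a real value at runtime and that it can be inferred from the call site.

```dart
// Hypothetical page types; in the real app these would be widgets.
class EventsPage {}

class StartEventPage {}

/// Minimal stand-in for a page object parameterised by its page type.
class PageObject<T> {
  /// T is reified, so it can be read as a runtime Type value.
  Type get pageType => T;
}

/// Mirrors the shape of tapAndNavigate: S is usable inside the body at runtime.
PageObject<S> navigateTo<S>() {
  print('waiting for $S to appear');
  return PageObject(); // inferred as PageObject<S> from the return type
}

void main() {
  // S is inferred from the declared type of the variable.
  final PageObject<StartEventPage> page = navigateTo();
  print(page.pageType == StartEventPage); // true
}
```

In a language with type erasure, S would not be available inside the function body, and the caller would have to pass the page type as an extra argument.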

This doesn't apply everywhere of course, and some methods needed manually overriding, but it helps to narrow the gap between integration tests and widget tests.

All in all, this makes our integration tests just as easy to write as widget tests. For example, here's a widget test for the login page (comments mine):

testWidgets('it should allow a user to log in and log out', (tester) async {
  final loginPage = await launchAppLoggedOut(tester);
  await loginPage.logInTo<EventsPage>();  // default credentials for mock api here
  final appMenu = await pressAppMenuButton(tester);
  await appMenu.selectLogOutOption();
  await page<LoginPage>(tester);  // a helper function to allow async work in "constructors"
});

And now, the corresponding integration test:

integrationTest('should allow logging in and logging out', (tester, commands) async {
  final loginPage = await page<LoginPage>(tester);
  // "test helper" api available on testing environments, create a real user and return details
  final user = await commands.api.createUser();  
  await loginPage.logIn(user.credentials);
  final appMenu = await pressAppMenuButton(tester);
  await appMenu.selectLogOutOption();
  await page<LoginPage>(tester);
});

Firebase Test Lab

All this is reason enough to use integration_test, but there's an extra cherry on top.

Firebase Test Lab is a Google Cloud product that allows developers to submit native Android or iOS test binaries and run them on real devices. As mentioned, this was previously impossible, because flutter_driver required a Flutter-specific driver process running on the host machine, which was not supported.

With integration_test, however, all of the test code is bundled into the native binary that Flutter builds for us (e.g. libapp.so on Android), so it behaves exactly like a regular native app.

We're still in the early stages of exploring Firebase Test Lab, but initial impressions are promising. One aspect I haven't spoken about yet is performance.

Flutter apps can be built in 3 modes: debug, profile, and release. Debug mode imposes a very large performance penalty, but in exchange you get access to hot reload and devtools. When profiling, apps should be run in profile mode (unsurprisingly!). However, profile mode is not available when running on an emulator (e.g. in CI), because emulator performance isn't indicative of real-device performance anyway.

Firebase Test Lab uses physical devices, which lets us get over this hurdle and get accurate performance metrics for our builds, automated as part of a CI pipeline. We even get a video of the tests running, which can be very valuable when debugging a flaky test.

All in all, with well-organised unit tests and a little time, integration_test makes it easy to write high-quality integration tests for Flutter apps.