Fakes and Stubbing and Mocks, Oh My!
====================================
This page seeks to provide an overview of mocking and a related task:
redirecting function calls to test-only code. Note: many people use the term
"mocking" to refer to the latter (and that's fine!), but we'll try to keep the
concepts separate in this doc.

KUnit currently lacks specific support for either of these, in part because
there are enough trade-offs that it's hard to come up with a generic solution.
Why do we need this?
--------------------
First, let's consider what the goal is. We want unit tests to be as
lightweight and hermetic as possible, and only test the code we care about.
A canonical example in userspace testing to consider is a database.
We'd want to verify that our code behaves properly (inserts the right rows to
the database, etc.), but we don't want to bring up a test database every time
we run our tests.
Not only will this make the test take longer to run, it also adds more
opportunities for the test to break in uninteresting ways, e.g. if writes to
the database fail due to transient network issues.
If we can construct a "fake" database that implements the same interface, which
is simply an in-memory hashtable or array, then we can have much faster and
more reliable tests. Unit tests simply don't need the scalability and features of
a real database.
Fakes versus mocks
------------------
We'll be using terminology roughly as follows:
- a "test double" is the more generic term for any kind of test-only replacement.
- a "mock" is a test double that specifically can make assertions about how it's
  called and can return different values based on its inputs.
- a "fake" is a test double that mimics the semantics of the code it's replacing
but with less overhead and dependencies, e.g. a fake database might just use
a hash table, or a fake IO device which is just a ``char buffer[MAX_SIZE]``, or UML itself (in a sense).
Mocks generally are written with support from their testing framework, whereas
fakes are typically written without it. KUnit currently lacks any features to
specifically facilitate mocks, so it's recommended to create and use fakes.
Downsides of mocking
--------------------
Very briefly, using mocks in tests can make tests more fragile since they test
"behavior" rather than "state."
What do we mean by that? Let's imagine we're testing some userspace program
using a gMock-like syntax (gMock is a C++ mocking framework):
.. code-block:: c

    void send_data(struct data_sink *sink)
    {
        /* do some fancy calculation to figure out what to write */
        sink->write("hello, ");
        sink->write("world");
    }

    void test_send_data(struct test *test)
    {
        struct data_sink *sink = make_mock_datasink();

        EXPECT_CALL(data_sink, write("hello, "));
        EXPECT_CALL(data_sink, write("world"));
        send_data(sink);
    }
And now let's say we've realized we can make our code twice as fast with more
buffering, effectively changing it to:
.. code-block:: c

    void send_data(struct data_sink *sink)
    {
        sink->write("hello, world");
    }
Oops, now our mock-based tests are failing since we've changed how many times
we call ``write()``!

Contrast this to a state-based approach where ``write()`` might just append to
some ``char buffer[MAX_SIZE]``. In that case, we can validate that
``send_data()`` worked by just using
``KUNIT_EXPECT_STREQ(test, buffer, "hello, world")``, and that check would pass
for either implementation.
A further downside is that the test author has to mimic the behavior
themselves, i.e. the return values for each ``write()`` call. This means if
the test author makes a mistake or tests just don't get updated after a
refactor, the mock can behave in unrealistic fashion.
This can and *will* eventually lead to bugs.
Upsides of mocking
------------------
This is not to say that one should never test "behavior", i.e. use mocking.
For example, imagine we *wanted* the example test to validate that we only
call ``write()`` once, since each call is super-expensive. Or consider when
there's no easy way to validate that the state has changed, e.g. if we want to
validate that ``prefetchw()`` is called to pull a specific data structure into
cache.

It's also easier to use a mock if we want to force a certain return value,
e.g. if we want to make a specific ``write()`` call fail so we can test an
error path. With our ``data_sink`` example above, it's hard for an append into
a ``char buffer[MAX_SIZE]`` to fail until we hit ``MAX_SIZE``, but for real
code that might be writing to disk or sending data over the network, failure
could happen on ~any call. And it's valuable to test that our code is robust
against such failures.
Function redirection
--------------------
Regardless of what kind of test double you use, it's useless unless you can
swap out the real code for it. For lack of a better term, we'll refer to this
as function redirection: how do I make calls to ``real_function()`` go to my
``fake_function()``?

In other test frameworks (Python's unittest, JUnit for Java, Googletest for
C++, etc.), this is fairly easy, because they rely on techniques like dynamic
dispatch that have language support. We can and do re-implement dynamic
dispatch in C in the kernel, but this adds runtime overhead which may or may
not be acceptable in all contexts.

The problem boils down to adding another layer of indirection, and we have
various options to choose from, which we'll describe below.
For each of these, let's consider the following code:
.. code-block:: c

    static void func_under_test(void)
    {
        /* unsafe to call this function directly in a test! */
        send_data_to_hardware("hello, world\n");
    }
Run time (ops structs, "class mocking")
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is the most straightforward approach and fundamentally boils down to
doing something like the following:
.. code-block:: c

    static void func_under_test(void (*send_data_func)(const char *str))
    {
        send_data_func("hello, world\n");
    }
Being a bit more sophisticated, we can introduce a struct to hold the
function pointer(s):
.. code-block:: c

    struct send_ops {
        void (*send)(const char *str);
        /* maybe more functions here in real code */
    };
TODO: write about "class mocking" and its RFC.
Pros:

- Simplest implementation: "it's just code."
- This is the only approach here where we can limit the scope of the
  redirection.

  - The subsequent approaches **globally** redirect all calls to
    ``send_data_to_hardware()``, potentially in code not-under-test we
    don't want to mess with.

- There are plenty of such structs throughout the kernel.

  - And users don't need any special support from KUnit.

Cons:

- ~Everyone knows about this convention but still wants "mocking." It's not
  seen as sufficient by itself.
- Requires the most invasive code changes if the code isn't already using
  this pattern.
- Introduces runtime overhead (an indirect call, another function
  argument, etc.).
- If ``func_under_test()`` is publicly exposed, but ``send_data_func()`` is
  not (most likely the case), users need to work around this.
- The RFC for "class mocking" requires a lot of boilerplate, even after
  providing macros to take care of most of it.

  - This is fundamentally a limitation of C (as opposed to C++, where
    classes have language support). It's unlikely we can improve much on
    this.
Compile time
~~~~~~~~~~~~

TODO: write me.
Link time (__weak symbols)
~~~~~~~~~~~~~~~~~~~~~~~~~~

TODO: write me.
Binary-level (ftrace et al.)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TODO: write me.
TODO: include discussion on global functions/general statefulness.

TODO: include section on worked example use cases.