The following guidelines are for writing unit tests. Some unit testing articles may contradict them, but they have worked well for LigaData.

  1. Unit tests should always be standalone.
    • They should never rely on a previous test or set up a following test.
    • They should be run both by themselves and in groups multiple times to ensure they can operate just as well in both cases.
    • They should be run in random order a few times to expose hidden ordering dependencies. This also applies to guideline 2.
  2. Unit tests should be clean to start and clean to finish.
    • Use test fixtures to set up and tear down tests appropriately. Any services that are no longer needed should be shut down, any files created should be deleted, and so on.
    • When verifying unit tests, run them multiple times in a row.
      • If they only pass the first time or under certain conditions, check the setup and teardown methods. Make sure things are clean.
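The fixture, repeat-run, and random-order advice in guidelines 1 and 2 can be sketched as follows. This is a minimal illustration in Python's standard `unittest` module, not LigaData's actual test harness; `ScratchDirTest` and `run_shuffled` are hypothetical names invented for the example.

```python
import io
import os
import random
import shutil
import tempfile
import unittest


class ScratchDirTest(unittest.TestCase):
    """Hypothetical test case: every test gets its own scratch directory."""

    def setUp(self):
        # Clean start: a fresh directory per test, never shared between tests.
        self.work_dir = tempfile.mkdtemp(prefix="unit_test_")

    def tearDown(self):
        # Clean finish: remove everything the test created, pass or fail.
        shutil.rmtree(self.work_dir, ignore_errors=True)

    def test_writes_output_file(self):
        path = os.path.join(self.work_dir, "out.txt")
        with open(path, "w") as handle:
            handle.write("hello")
        self.assertTrue(os.path.exists(path))

    def test_starts_with_empty_directory(self):
        # Holds in any order because setUp never reuses a directory.
        self.assertEqual(os.listdir(self.work_dir), [])


def run_shuffled(times=3, seed=None):
    """Run the suite several times in a row, shuffling test order each time."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(times):
        names = list(unittest.defaultTestLoader.getTestCaseNames(ScratchDirTest))
        rng.shuffle(names)
        suite = unittest.TestSuite(ScratchDirTest(name) for name in names)
        # Send runner output to a buffer so only the pass/fail result is kept.
        result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
        outcomes.append(result.wasSuccessful())
    return outcomes
```

If any run in the returned list is `False`, the setup and teardown methods are the first place to look.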
  3. Mocks versus embedded services
    • Embedding a database can be alright for unit tests. That being said, if the test isn’t specific to any particular database, using a mock object/map may be preferable.
      • Unit testing using Cassandra is alright if testing the Cassandra storage library. If testing an adapter, metadata API, or the engine, try to avoid using a database.
    • Embedding Kafka and ZooKeeper is alright (and often necessary), but they must be cleaned properly either before or after each test.
    • All that said, mocking the external services that LigaData uses is likely to be quite error-prone, so embedded services are usually the better choice for stability.
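For the case in guideline 3 where a test is not specific to any particular database, a map-backed stand-in might look like the sketch below. Both `InMemoryStore` and `ModelRegistry` are hypothetical names used only for illustration; they do not correspond to any actual LigaData class.

```python
class InMemoryStore:
    """Hypothetical map-backed stand-in for a storage library."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        # Returns None when the key is absent.
        return self._rows.get(key)


class ModelRegistry:
    """Hypothetical component that needs storage, but no specific database."""

    def __init__(self, store):
        # The store is passed in, so tests can supply the in-memory version.
        self._store = store

    def register(self, name, version):
        if self._store.get(name) is not None:
            raise ValueError("already registered: " + name)
        self._store.put(name, version)

    def version_of(self, name):
        return self._store.get(name)
```

Because the store is injected through the constructor, the same component can be tested against an embedded service when the database behavior itself matters.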
  4. Use unit tests to help design a feature.
    • Unit tests force code to be exercised programmatically, which helps to discover where improvements can be made.
    • Writing unit tests before writing production code can help clarify how the feature is expected to be used.
  5. Make classes unit-testable.
    • Making every method private isn’t always a good idea. Writing unit tests helps understand what needs to be public and what can remain private. Here are some thoughts on design:
      • Being able to run a service (such as the engine or metadata API service) programmatically is a requirement for unit testing (both for Kamanja developers and customers). A public API should exist for managing the server-side products (something Cassandra, ZooKeeper, and Kafka all do).
      • Configuration shouldn’t be unit tested. If the feature requires configuration, then configuration needs to be separate from the feature.
      • Supplying management, cleanup methods, and getters in a server utility class/object can make things easier to manage for any future unit tests written using that component.
      • Having a state-based getter can help in verifying the integrity of a running component. It can indicate whether or not it’s alright to move on to the next step of a test (for example: are the adapters loaded, is the metadata loaded, is the engine running, and so on).
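The management methods and state-based getter described in guideline 5 can be sketched as follows. `EmbeddedService` and `wait_until` are hypothetical names, and the sleep only stands in for real startup work; this is an illustration of the idea rather than how any Kamanja service is implemented.

```python
import threading
import time


class EmbeddedService:
    """Hypothetical service wrapper with start/stop and a state getter."""

    def __init__(self, startup_delay=0.05):
        self._startup_delay = startup_delay
        self._ready = threading.Event()
        self._thread = None

    def start(self):
        # Start in the background, as a real service would.
        self._thread = threading.Thread(target=self._boot, daemon=True)
        self._thread.start()

    def _boot(self):
        time.sleep(self._startup_delay)  # stands in for real startup work
        self._ready.set()

    def is_running(self):
        # State-based getter: tests poll this instead of sleeping blindly.
        return self._ready.is_set()

    def stop(self):
        # Cleanup method for test teardown.
        if self._thread is not None:
            self._thread.join()
            self._thread = None
        self._ready.clear()


def wait_until(condition, timeout=2.0, interval=0.01):
    """Poll condition() until it is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return condition()
```

A test would call `start()`, block on `wait_until(service.is_running)` before moving to its next step, and call `stop()` in its teardown.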
  6. Throw custom exceptions instead of using print statements.
    • Not only does this make debugging from other components easier for other developers, it also makes negative cases easy to unit test.
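As a small illustration of guideline 6, the sketch below raises a custom exception and checks the negative case in a test. `ModelNotFoundError`, `load_model`, and the dictionary-based registry are hypothetical and exist only for this example.

```python
import unittest


class ModelNotFoundError(Exception):
    """Hypothetical custom exception carrying the name that was looked up."""

    def __init__(self, model_name):
        super().__init__("model not found: " + model_name)
        self.model_name = model_name


def load_model(registry, model_name):
    # Raise instead of printing, so callers and tests can react to the failure.
    if model_name not in registry:
        raise ModelNotFoundError(model_name)
    return registry[model_name]


class LoadModelTest(unittest.TestCase):
    def test_missing_model_raises(self):
        # Negative case: the test asserts on the exception type and its data.
        with self.assertRaises(ModelNotFoundError) as caught:
            load_model({}, "fraud_v1")
        self.assertEqual(caught.exception.model_name, "fraud_v1")

    def test_known_model_is_returned(self):
        self.assertEqual(load_model({"fraud_v1": "model"}, "fraud_v1"), "model")
```

Had `load_model` only printed a message, the test would have had nothing to assert on short of capturing console output.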
  7. Don’t write end-to-end/integration tests.
    • These aren’t unit tests. Testing a UDF by calling it programmatically is a unit test. Calling a UDF by compiling PMML and calling it through metadata is not a unit test.
    • If a developer has to control too many moving parts, it probably isn’t a unit test.
      • If a developer is absolutely certain it has to be unit-tested, reconsider the design to make it more unit-test friendly.
  8. Unit tests should be quick.
    • If a unit test takes more than a couple of minutes to run, it’s probably too long. Consider whether it’s really a unit test.
      • This isn’t true in all cases. Sometimes things just take a while to get running. Still, keep it in mind.
  9. Unit tests are not manual tests.
    • If a test has any manual components at all, they need to be automated. A unit test should never require a manual step.