Startup Engineering

Finbourne - How a Fintech tackles dual meanings of time

Episode Summary

Lots of systems store events that happen at moments in time. But what if a timestamp can have more than one meaning? Tom, the co-founder of Finbourne, explains how they use bi-temporal data and event sourcing to build a consistent view of portfolio data, one where they can look back at any point in time and find two truths: what did the system see at that time, and what did the world see? Find out the difference between "effective at" and "as at", and learn what happens when you need to make corrections to the timeline.

Episode Notes

Resources:

Episode Transcription

Rob: Welcome back to Startup Engineering, where we go behind the scenes at startups. I'm Rob De Feo, startup advocate at AWS, and together we'll hear from the engineers, CTOs and founders at some of the world's leading companies. From the first user to tens of thousands of requests per second and everything in between, experts share their experiences, lessons learned and best practices. Tom, our guest, is the co-founder, former CTO and now CEO of Finbourne Technology. Before that he spent his career building technology at large financial institutions like UBS. Tom, can you get us started by telling us about Finbourne and what problem it solves?

Tom: We set up about three years ago to structurally change the cost of investing for everyone, and that's a very ambitious target. But we didn't try to do it the way you'll hear an awful lot of companies or startups talk about disruption. Instead, we tried to think of it from an efficiency point of view. What we're doing is building a platform where our customers can store all the data they need to run their business. Our customers tend to be asset managers and hedge funds, and the data they need to store is things like what trading they have done, what market data they're using to value the assets they hold, and how they make sure they safely entitle only the compliance officer to see certain fields and their investment managers to see the data they need to run their trading book. We're solving quite a broad range of problems, but all of them are typically in the space of making an asset manager or a hedge fund more efficient.

Rob: To me hedge funds are a little bit of a black box; I don't exactly know how they work or what they do. And you're the opposite: you're very open about the way that you build and the way that you approach your business. When working with hedge funds, did this openness work?

Tom: Yeah, I think in that respect we're not really doing anything terribly new. Over the past 10, 15 years, from what we've seen, an awful lot of the breakthroughs have come from people who have been open about how they develop software. Open APIs, GitHub, all of that open source and shared developer community has led to a transformation in productivity, in ecommerce, in scheduling and bookings, in online services generally. And what we found is that hedge funds, asset managers and the investing community haven't really taken to it in the same way when you look at the complex end of their business. At the simpler end, what's in my bank account, people like Monzo or Revolut have looked at that part of the market. But for stuff like what's in my pension fund, or what's in my book of business if I'm an asset manager, it hasn't really followed the same approach, and that's where we come in.

Rob: You said that asset managers are looking for more detail. Can you describe exactly what data and information they get from your product?

Tom: What they get is the ability to look through everything they've ever done on the platform. We have APIs that allow them to look at the list of portfolios they're managing, to look at the list of transactions they've done in any portfolio, to store and look at the market data that they tend to use, to look at that from a multitude of providers, and then to run calculations against it, either using our software or other people's software.

Rob: From what you're describing, there appear to be large amounts of data, made up of small pieces, that you need to process to create this view. So what were the engineering challenges you had when you were creating this product?

Tom: If you think about the types of data we're trying to present to people, it has, I guess, disparate shapes. Some of it is very much point-in-time data, so here's the prospectus I have for my fund. Well, that's a big piece of data and it really only happens once. If you look at the trading activity in the fund, well, that's very, very long histories. You could have done 150, 200 million trades over the last 15 years. And what you need to do to figure out where you are today is go through that entire sequence of trades and add them all up with the right accounting methodologies to figure out what your position is right now. Those are two of the extremes of the data types that we tend to look at.

Rob: The second example you described sounds much more complicated, just due to the fact that it could be so much more data; you're talking about millions of trades. What were the challenges you had when you were designing and building a system that can cope with processing this much data?

Tom: One of the efficiencies we're trying to build in for people is a guarantee around reproducibility of that data. What we want to be able to do is go through those 150 million entries and figure out what your current position is, but do it in a way so that if you ask the platform, "at five pm last night, where was I sitting in terms of my holdings?", we will always give you the same answer. There are very few platforms, in fact I'm not really aware of any out there, that will give you that kind of guarantee.

Rob: One of the things that really stuck out in what you just said is that you guarantee you give the same answer. And this is quite unusual, because intuitively with technology, if you think of a calculator, I put the same values in and I get the same values out. What makes it particularly difficult here to guarantee the same output?

Tom: If you consider how people have modelled this in the past, they reach for a relational database. For the example of that portfolio, I'll have a table with the details of the portfolio: what index or benchmark is it tracking, who's actually running the trades in it, and the whole pile of other metadata associated with it. Then inside that you'll have a link to a transactions table, and that transactions table will typically have the time at which the transaction takes effect. So I bought some Vodafone shares yesterday and some Apple shares the day before, and then I add them all up and I come to the number of units I hold in each of those securities, and indeed how much it cost me to get into that position. By doing that, I have my current position, and indeed the amount of money I've made or lost by doing that trading. But that's really not enough to allow you to reproduce it, because if I go back and discover that I actually made a mistake, and I didn't buy 100 Vodafone shares yesterday, I bought 200, well, now I have to go into that table and change the 100 to 200. And if I then ask "what did I hold last night?", the answer will say two hundred and not one hundred. That's the problem that we're trying to cater for, in a very general sense.
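
To make the problem Tom describes concrete, here is a minimal sketch in Python with SQLite. The table, instruments, dates and numbers are invented for illustration and are not Finbourne's schema; the point is just that an in-place correction destroys last night's answer.

```python
import sqlite3

# Naive single-table model: one row per transaction, keyed only by trade time.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE transactions (
        portfolio TEXT,
        instrument TEXT,
        units INTEGER,
        traded_at TEXT          -- when the trade took effect
    )
""")

# Yesterday I recorded 100 Vodafone shares.
conn.execute("INSERT INTO transactions VALUES ('fund-1', 'VOD', 100, '2020-05-04 11:00')")

def holdings(as_of):
    """Add up all trades effective on or before `as_of`."""
    row = conn.execute(
        "SELECT SUM(units) FROM transactions WHERE instrument = 'VOD' AND traded_at <= ?",
        (as_of,),
    ).fetchone()
    return row[0] or 0

print(holdings("2020-05-04 17:00"))   # 100 -- what the report said last night

# Today I discover the trade was really 200 units, so I correct the row in place.
conn.execute("UPDATE transactions SET units = 200 WHERE instrument = 'VOD'")

print(holdings("2020-05-04 17:00"))   # 200 -- last night's report can no longer be reproduced
```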

Rob: I see. What you're describing sounds like something I've tried to build in the past, and I've had problems when I've had to change historical data but still represent both what it was at that time and what it is now, and then run into all sorts of trouble trying to figure this out. What's the solution that you came to?

Tom: If you think of the history of what you would have done in the past, the next thing people would tell you to do is put an audit table in place on the database. Then what you have is the ability to go back and see what happened in the data store, and look at the audit table to see who changed it. But what that still doesn't give you is a nice general paradigm where you can say "what did it look like yesterday?" without starting to write some very complicated stored procedures. So you end up going, well, I want to ask what the system looked like last night, so now I need to look at the main table and the audit table and know exactly when and when not to apply each of those entries. And what we've seen in the past is that typically, when people go about trying to solve this problem of bi-temporal data, you end up in a scenario where the database logic you have to write is nigh-on impossibly complex.
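
And here is roughly what that audit-table workaround looks like in the same toy setup, again with invented names and dates, and handling only a single correction. Even this simplified query has to reason about which version was live at the time you are asking about, which is the complexity Tom is pointing at.

```python
import sqlite3

# The audit-table workaround: keep live rows in one table and copy superseded
# versions into an audit table as corrections arrive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        txn_id INTEGER PRIMARY KEY,
        instrument TEXT, units INTEGER, traded_at TEXT
    );
    CREATE TABLE transactions_audit (
        txn_id INTEGER, instrument TEXT, units INTEGER, traded_at TEXT,
        superseded_at TEXT      -- when this version stopped being the current one
    );
""")

conn.execute("INSERT INTO transactions VALUES (1, 'VOD', 100, '2020-05-04 11:00')")

# Correction the next morning: copy the old version into the audit table, then overwrite.
conn.execute("""INSERT INTO transactions_audit
                SELECT txn_id, instrument, units, traded_at, '2020-05-05 09:00'
                FROM transactions WHERE txn_id = 1""")
conn.execute("UPDATE transactions SET units = 200 WHERE txn_id = 1")

# "What did the system believe at 17:00 yesterday?" now needs logic spanning both
# tables: prefer the audit version that was still live then, else the current row.
# (This already ignores rows inserted after that time -- the edge cases pile up fast.)
seen_at = "2020-05-04 17:00"
row = conn.execute("""
    SELECT COALESCE(
        (SELECT units FROM transactions_audit
          WHERE txn_id = 1 AND superseded_at > ?
          ORDER BY superseded_at LIMIT 1),
        (SELECT units FROM transactions WHERE txn_id = 1)
    )
""", (seen_at,)).fetchone()
print(row[0])   # 100
```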

Rob: You mentioned some of the difficulties when querying bi-temporal data. Before we get into that, can you describe what bi-temporal data is and when it's used?

Tom: Temporal data is data that has a time dimension to it. Think of something as simple as, in the investing world, "I bought some shares". Well, there's a time element to that: when you bought them. Bi-temporal data, for us, records not only when you bought those shares, but when the machine saw it. In effect, you can quite simply reproduce the state of your entire machine at any point in time. In the case of the Vodafone and the Apple shares I was referring to, I bought them yesterday and the day before, at 11 a.m. yesterday and 11 a.m. the day before. That's the temporal aspect to the data. Now, the reason for the bi-temporal nature is to solve exactly the problem we've just been talking about. The first Vodafone trade I put in, I did it for 11 o'clock yesterday, at 11 o'clock yesterday. And the correction I put in, I did for 11 o'clock yesterday, at 11 o'clock this morning. By putting those two timestamps onto each piece of data, I can then go back and ask the system to show me my holdings for 5 p.m. last night as at 5 p.m. last night, or to show me the holdings for 5 p.m. last night as at now. One of those will include the correction to the Vodafone trade and the other one won't.
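
As a rough sketch of the two-timestamp idea in Python (invented class names, instruments and dates, with the correction modelled as a delta rather than whatever Finbourne actually stores), the same question about 5 p.m. last night gets two different, but each reproducible, answers depending on the as at time you ask with:

```python
from dataclasses import dataclass
from datetime import datetime

# Each event carries the two timestamps Tom describes.
@dataclass
class TradeEvent:
    instrument: str
    units: int
    effective_at: datetime   # when the trade happened in the real world
    as_at: datetime          # when the system recorded it

events = [
    # Original Vodafone trade: done for 11:00 yesterday, recorded at 11:00 yesterday.
    TradeEvent("VOD", 100, datetime(2020, 5, 4, 11), datetime(2020, 5, 4, 11)),
    # Apple trade the day before.
    TradeEvent("AAPL", 50, datetime(2020, 5, 3, 11), datetime(2020, 5, 3, 11)),
    # Correction: still effective 11:00 yesterday, but only seen at 11:00 this morning.
    # It tops the original 100-unit booking up to 200 units (a +100 delta).
    TradeEvent("VOD", 100, datetime(2020, 5, 4, 11), datetime(2020, 5, 5, 11)),
]

def holdings(effective_at, as_at):
    """Sum every event seen by `as_at` that was effective by `effective_at`."""
    totals = {}
    for e in events:
        if e.effective_at <= effective_at and e.as_at <= as_at:
            totals[e.instrument] = totals.get(e.instrument, 0) + e.units
    return totals

five_pm_last_night = datetime(2020, 5, 4, 17)

# What did the system believe at 17:00 last night? The correction hadn't arrived yet.
print(holdings(five_pm_last_night, as_at=five_pm_last_night))            # {'VOD': 100, 'AAPL': 50}

# What do we now know about 17:00 last night? The correction is included.
print(holdings(five_pm_last_night, as_at=datetime(2020, 5, 5, 12)))      # {'VOD': 200, 'AAPL': 50}
```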

Rob: So you're using these two additional fields to add an extra dimension to time. The way that's modelled seems quite simple. Is there anything more to it than that?

Tom: It kind of is a simple solution, and it tends to lead people to ask: why don't all stores for temporal data operate the way I just described? The problem is that those fields, when you get into corrections on top of corrections, actually become quite complex. In our parlance, the way we describe it is that the trade done for 11 a.m. yesterday morning has an effective time of 11 a.m. yesterday, and the time the machine saw it is the as at time. So, effective 11 a.m. as at 11 a.m. Then when you put the correction in, it's still effective for 11 a.m., but as at 11 a.m. on the next day. What you can then do is use those two queries, effective 11 a.m. as at 11 a.m., and effective 11 a.m. as at later, and the difference between them gives you the correction. When you start to put those columns onto a database, you need four of them, because if I make two corrections, one is as at T to T plus one and the next one is as at T plus one to infinity. You can see, even by the way I'm describing this, that very quickly you're going to end up with very complex stored procedures and quite involved database logic. When we wrote the system, the first, I guess, two or three iterations of bi-temporal data stores we wrote followed that paradigm, until we realized that when we wanted to do it across the whole system, we needed to abstract it into a more general view and store it in the databases. So we spent a lot of time putting in triggers and views, making the data stores automatically know about bi-temporal data, so the average database developer didn't have to worry about anything more than pushing in the updates they would have done previously.
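
The four-column interval form Tom mentions looks roughly like this sketch (column and function names are invented, not the real schema). Every correction closes the previous version's as at window, and that bookkeeping is exactly what becomes painful once it lives in triggers and stored procedures:

```python
from datetime import datetime

INFINITY = datetime.max

# Interval form of the same idea: each version of a row carries four timestamps.
rows = []

def correct(key, units, effective_at, as_at):
    """Close the as-at window of the current version of `key` and open a new one."""
    for row in rows:
        if row["key"] == key and row["as_at_to"] == INFINITY:
            row["as_at_to"] = as_at            # old version now ends at the correction time
    rows.append({
        "key": key, "units": units,
        "effective_from": effective_at, "effective_to": INFINITY,
        "as_at_from": as_at, "as_at_to": INFINITY,
    })

def version_at(key, effective_at, as_at):
    """Pick the single version whose effective and as-at windows both cover the query."""
    for row in rows:
        if (row["key"] == key
                and row["effective_from"] <= effective_at < row["effective_to"]
                and row["as_at_from"] <= as_at < row["as_at_to"]):
            return row["units"]
    return None

T = datetime(2020, 5, 4, 11)
correct("VOD", 100, effective_at=T, as_at=T)                          # original booking
correct("VOD", 200, effective_at=T, as_at=datetime(2020, 5, 5, 11))   # first correction
correct("VOD", 250, effective_at=T, as_at=datetime(2020, 5, 6, 11))   # correction on a correction

print(version_at("VOD", T, as_at=datetime(2020, 5, 4, 17)))   # 100
print(version_at("VOD", T, as_at=datetime(2020, 5, 5, 17)))   # 200
print(version_at("VOD", T, as_at=datetime(2020, 5, 6, 17)))   # 250
```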

Rob: The way you're describing it really brings out the complexities that are involved in building this. But I wonder, when did you discover that there were these extra complexities?

Tom: As always, something went wrong, right? The problem with solving an issue like this is that the edge cases are insanely complex. So you design it, you write your unit test cases, you're sitting there happy with yourself, you've just cracked a really difficult problem. And then somebody comes up to you and says, "guys, the month-end statement I just sent out was incorrect". This literally happened to us, and we went, "no, it can't be, it's a bi-temporal system, it's guaranteed to reproduce everything". And then six or seven days of hardcore debugging later, you discover that there's an edge case in one of the views or the triggers that you've written. The client's deeply unhappy with you and everybody's stressed out to the max. That's when we realized this was almost too complex a problem to be solving that way. The approach of database views and triggers, abstracting all the logic into a general case, turned out to have only one or two edge cases, but they were so difficult to debug that we thought we couldn't live with it forever.

Rob: You'd come across these really difficult edge cases that you needed to solve. How did you work through the problem and arrive at a solution?

Tom: Yeah, so our engineers basically sat in a dark room for about two weeks thinking about whether there were any other paradigms we could bring to bear on this thing. I can't even tell you exactly how it happened, because it was about three years ago, but suddenly one of our lead engineers came up with the idea of using event sourcing to solve this problem. As soon as we saw it, we were like, "oh, eureka". It's one of those moments where you think, of course, why hasn't everybody thought of using that as a paradigm until now? It was just such a neat solution.

Rob: Event sourcing provides a way for the events that come into a system to change its state and then provide notifications to other parts of the system. What part of event sourcing was particularly attractive in your situation?

Tom: If you think about it logically, I've described a world in database land where I have a portfolio table and a transactions table, linked to an instruments table and off to a prices table, and everyone's familiar with how you write stored procedures and get that data out. What we saw in terms of event sourcing was that a portfolio and the history of the trading activity in it happen as a nice event sequence. Logically, there's nothing wrong there, right? I did those trades at 11 a.m. yesterday and I did a new one today, and as long as I have something that knows how to read through that sequence of events and build you back up into a portfolio state, then off you go. That's really, really nice; it's an elegant solution. But most people shy away from it, because their view is that all you've done is move all the complexity into the algorithm you're writing in your code instead of just leaving it in the database as a select statement. The reason it was attractive to us comes down to the bi-temporal aspect. If you stamp each event with these two timestamps, then you naturally have, in effect, selectors over that sequence of events in both effective and as at space. The subsystem worries about giving you back the right series of events, and the writer of the builder just worries about how to chain those events together to give you the right holdings. At which point we've taken a very, very complex problem in database world and stored-procedure world and rewritten it on a substrate where the developer doesn't need to worry about the ordering of event sequences at all, just about how to build them. And that was literally the thing we wanted to do: we wanted to stop every developer having to worry about "am I correct in effective and as at space?" and just worry about how to build the sequence of events.
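
A minimal sketch of that split, with invented names and a toy in-memory store rather than anything Finbourne ships: the substrate is the only code that knows about effective and as at time, and the builder just folds whatever events it is handed.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable

@dataclass(frozen=True)
class Event:
    instrument: str
    delta_units: int
    effective_at: datetime
    as_at: datetime

class EventStore:
    """The substrate: append-only, and the only place that knows about bi-temporality."""
    def __init__(self):
        self._events: list[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def select(self, effective_at: datetime, as_at: datetime) -> Iterable[Event]:
        # Return, in order, every event seen by `as_at` that is effective by `effective_at`.
        return sorted(
            (e for e in self._events
             if e.effective_at <= effective_at and e.as_at <= as_at),
            key=lambda e: (e.effective_at, e.as_at),
        )

def build_holdings(events: Iterable[Event]) -> dict[str, int]:
    """The builder: no temporal logic at all, just chain the events into a position."""
    holdings: dict[str, int] = {}
    for e in events:
        holdings[e.instrument] = holdings.get(e.instrument, 0) + e.delta_units
    return holdings

store = EventStore()
store.append(Event("VOD", 100, datetime(2020, 5, 4, 11), datetime(2020, 5, 4, 11)))
store.append(Event("VOD", 100, datetime(2020, 5, 4, 11), datetime(2020, 5, 5, 11)))  # correction

last_night = datetime(2020, 5, 4, 17)
print(build_holdings(store.select(last_night, as_at=last_night)))                 # {'VOD': 100}
print(build_holdings(store.select(last_night, as_at=datetime(2020, 5, 5, 12))))   # {'VOD': 200}
```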

Rob: Okay, let's take a look at the implementation, and let's start with the data. What data storage do you use?

Tom: We take advantage of Aurora from AWS. Even though what I've just described says we're not very relational-database heavy, we still store it in a relational database. We started off in native Postgres, but we found we ran into some problems around indexing that you guys had already solved in Aurora, so we switched to that service. The reason we do that is we want synchronous replication across availability zones and asynchronous replication to other geographies for business continuity. There's nothing about our design that naturally needs a relational database, but we do want those guarantees, and there was no point in us reinventing every wheel; we've done enough reinvention as part of this algorithm and a few others we've come up with. It also gives us throughput to disk that performs much better than the alternatives we looked at, like a Cassandra cluster or even S3. Using Aurora on AWS has given us just remarkably good throughput.

Rob: This is quite unusual, because normally we suggest picking the best database tool based on your access patterns and the data you're putting in there. And you mentioned that the data is not really relational in nature. So what was the metric that led you to Aurora versus a NoSQL store or other types of data stores?

Tom: A lot of it was about the testing we performed. The paradigm I've been explaining to you is one where you effectively have a long event history in your entities, your portfolios. But that's not the only access pattern we have. We have ones where you don't have a long event history and typically have a larger event payload, we have patterns that sit somewhere in the middle, and we also need to solve patterns for time series. We found the best general solution for us was Postgres. If you look at a time series, it lends itself very nicely to a solution where you put three hundred and sixty-six days of data into an array in Postgres. When you look at things with a big payload, the JSON store in Postgres worked very well for that. And when you look at long event sequences, it turns out Postgres performs very well for that too, when you use it on Aurora. In native Postgres we tried a variety of the hash indices, the B-tree indices and a number of the slightly more esoteric ones, and we ran into an awful lot of issues that, it turns out when we spoke to the engineers at AWS, you guys had run into as well. When we discovered we were solving more or less the same problems, we just moved straight to Aurora. We set up a test harness and we were able to get on the order of 10,000 concurrent reads per second out of the installation we had on Aurora.
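
As a rough illustration of those three shapes, here is some Postgres DDL held in Python strings. The table and column names are invented, not Finbourne's schema, and in practice you would run statements like these through a driver such as psycopg2 against a real cluster.

```python
# Three storage shapes: a year of time-series values per row, large documents,
# and long bi-temporal event sequences.

TIME_SERIES_DDL = """
CREATE TABLE price_series (
    instrument  text,
    year        integer,
    -- one slot per day of a (possibly leap) year, so a whole year is a single row
    prices      double precision[366],
    PRIMARY KEY (instrument, year)
);
"""

DOCUMENT_DDL = """
CREATE TABLE fund_documents (
    fund_id      text,
    effective_at timestamptz,
    as_at        timestamptz,
    -- large, rarely-changing payloads such as a prospectus fit naturally in jsonb
    body         jsonb,
    PRIMARY KEY (fund_id, effective_at, as_at)
);
"""

EVENT_LOG_DDL = """
CREATE TABLE portfolio_events (
    portfolio_id text,
    effective_at timestamptz,
    as_at        timestamptz,
    payload      jsonb,
    -- long event sequences are read back in (effective, as at) order
    PRIMARY KEY (portfolio_id, effective_at, as_at)
);
"""

if __name__ == "__main__":
    for ddl in (TIME_SERIES_DDL, DOCUMENT_DDL, EVENT_LOG_DDL):
        print(ddl)
```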

Rob: OK. This gives a really good picture of what data store you're using and why you chose it. Now let's talk about the data model. You mentioned as at and effective at as information you need to get out of this. Is that what you're storing, or is it something else?

Tom: We store the effective time and the as at time on every event in the sequence, plus what we need in order to return this sort of built payload very efficiently. I'm going to attempt to draw a picture. If you consider the natural view of how these events move in effective date and as at date, you'd see a diagonal on an X/Y axis: put effective date on the X axis and as at date on the Y axis, and you'd expect a diagonal coming out of the origin. What we need to do, rather than going back to the origin and building the 150-million-event history every time we want to return what you hold, is have most of it pre-built, until you hit a historic correction. When you hit a correction to the timeline, you need to move somewhere else, because some of the things along that diagonal are usable and some of them are not. So what we've done is basically come up with an algorithm that knows when a timeline becomes sealed in as at time and when you need to start promoting caches off the previous timeline onto the new one. That way, when somebody asks you a question in as at space, you know immediately what timeline you're on and you can quickly add any new events to it to return the built state. Our clients need us to return built portfolios at a rate of 6,000 per second, so this isn't something where, every time we get asked a question, we can go back along the timeline and re-add up all the events.
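
Here is a toy model of that timeline-and-cache idea (names and behaviour invented and heavily simplified, not Finbourne's algorithm): snapshots of the built portfolio live on the current timeline, and a back-dated correction starts a new timeline onto which only the still-valid snapshots are promoted.

```python
from datetime import datetime

class TimelineCache:
    def __init__(self):
        self.timeline_id = 0
        self.snapshots = {}   # effective_at -> built holdings, current timeline only

    def put(self, effective_at, holdings):
        self.snapshots[effective_at] = holdings

    def latest_before(self, effective_at):
        """Most recent usable snapshot at or before the requested effective time."""
        usable = [t for t in self.snapshots if t <= effective_at]
        if not usable:
            return None, {}
        best = max(usable)
        return best, self.snapshots[best]

    def correction(self, corrected_effective_at):
        """A historic correction seals the timeline; keep only still-valid snapshots."""
        self.timeline_id += 1
        self.snapshots = {t: s for t, s in self.snapshots.items()
                          if t < corrected_effective_at}

cache = TimelineCache()
cache.put(datetime(2020, 5, 1), {"VOD": 100})
cache.put(datetime(2020, 5, 4), {"VOD": 100, "AAPL": 50})

# A correction effective 3 May invalidates the 4 May snapshot but not the 1 May one,
# so a query for 4 May rebuilds from the 1 May snapshot plus the events since then.
cache.correction(datetime(2020, 5, 3))
print(cache.latest_before(datetime(2020, 5, 4)))   # the 1 May snapshot: {'VOD': 100}
```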

Rob: That's a big number of events to process. It sounds like you're not doing the processing inside the database. Does that mean that you're doing it inside code? And also you spoke about a substrate. How do these things all connect?

Tom: Yeah, it's happening in code. If you consider the database, we look at it as somewhere we can get some guarantees from. Aurora gives us guarantees of synchronous replication, asynchronous replication across geographies, and it allows us to efficiently lay out the data on disk, so that when we do need to get the event payloads back, we can get them back very, very fast into the middle substrate. The middle substrate is the bit where, when you ask it "can I have the portfolio effective last night, as at last night?", it immediately says, all right, these are the events that you need, and then it goes to the data store and returns that series of events to the builder. The coder who's responsible for building the entities then has to take that sequence of events, build it together and return it back to the API layer as fast as possible. Now, ideally the coder doesn't want to be doing too much work, but they need to do some: take the built entity from the last cache point, add the new events onto it, and return it up to the consumer. And all of that needs to happen in milliseconds, so that the user is not sitting there experiencing delays.
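
The builder step Tom describes amounts to something like this sketch, with invented function names and a trivial holdings model: start from the prebuilt entity at the last cache point and fold in only the marginal events.

```python
def apply_events(snapshot: dict, events: list) -> dict:
    holdings = dict(snapshot)                    # copy the prebuilt entity
    for instrument, delta in events:             # then add on only the new events
        holdings[instrument] = holdings.get(instrument, 0) + delta
    return holdings

last_night_snapshot = {"VOD": 200, "AAPL": 50}   # built and cached last night
todays_events = [("VOD", -25), ("BP", 10)]       # the handful of trades seen today

print(apply_events(last_night_snapshot, todays_events))
# {'VOD': 175, 'AAPL': 50, 'BP': 10}
```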

Rob: The standard advice developers are given when interacting with a database is to return the smallest subset of data possible and use really tightly defined select statements. Yet if you're doing all the processing inside the application layer, in code, then you're going to have to return larger quantities of data. How are you able to do this and still maintain this high level of performance?

Tom: If you consider how the caching structure works for us, some of the caching is alongside an entity that's pre-built. The portfolio is a great example. Even if you're changing data in the portfolio intraday, you typically change the data for yesterday much less frequently, the data for last week even less frequently, last month even less, and last year rarely. What you can naturally do there is have pre-built portfolios from each of those points, and then if somebody makes a historic correction, you need to invalidate very few of them. What you'd expect is that you'll return from the substrate only the last cache point that's relevant to you and the marginal events since then. So when somebody asks, "can I have the latest portfolio?", you'll sit there and say, "right, here you go, here's the one I built from last night and here's the ten trades I saw today", add those on to the end of the portfolio and return it to the consumer. What you also need to do over the top of that, and what we do, is make sure the same request for the same portfolio goes to the same node, so you don't build it every single time somebody asks for it, only when it's changing.

Rob: How do these nodes work? Are they the API endpoints?

Tom: So our API works across a number of layers, as you'd expect. One of the things we need to be very careful of is that entitlements are enforced next to the data; people sometimes make the mistake of not doing that.

Rob: What does enforcing entitlements close to the data mean?

Tom: Here's a good way to think about it, and unfortunately this is one of those things where I'm about to enter a semi-religious war when it comes to talking about where entitlements need to happen. Consider a portfolio where I own one unit of the FTSE, as an example. I go and ask the system "what do I own?", and it comes back and tells me I own one unit of the FTSE. But what I now need to do is say "tell me how much I am exposed to industrials", and what I have in my portfolio, along with the one unit of the FTSE, is some BP shares. I know that BP is in the FTSE too, but just humor me for a minute. And BP obviously wouldn't be in industrials, it would be in oil, for the purposes of this example I've just made up. But in the FTSE itself there are some industrial stocks. Now, to figure out how much industrials I own, I need to look through to see what's actually in the FTSE, what shares are in the FTSE 100, and I might not have a license to do that. So if I ask the system, "can I read my portfolio?", well, of course, it's my portfolio. But "can I see how much I own in industrials?", the answer could be yes or could be no, because maybe I have the rights to see it on a classification basis but not the rights to see the contents of the FTSE, because I haven't paid FTSE for a license to see those constituents. From our perspective, the only place that really knows the answer to that question is a mixture of the entitlements engine and the portfolio engine, because when it gets to the FTSE, it needs to go and break it down into what the FTSE owns. If you attempt to do that at the outer API layer, what happens is you have a very chatty exchange between the API and the entitlements system. Instead, our belief is that it should be done close to the data. And in fact, we've seen that doing it at the API layer can turn a request that might take half a second into something that takes 20, 30, 40 seconds for some of our clients, which is an unacceptable level of performance.
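
A toy illustration of why that look-through check sits next to the data (all instruments, weights, sectors and function names here are made up): only the code that expands the FTSE into its constituents knows that a licence check is needed at all.

```python
FTSE_CONSTITUENTS = {"industrial_co": 0.10, "oil_co": 0.90}   # licensed data
SECTORS = {"industrial_co": "industrials", "oil_co": "oil", "BP": "oil"}

def sector_exposure(holdings: dict, can_see_constituents: bool) -> dict:
    exposure: dict[str, float] = {}
    for asset, units in holdings.items():
        if asset == "FTSE":
            # Look-through required: this is where the licence entitlement matters.
            if not can_see_constituents:
                raise PermissionError("no licence for FTSE constituents")
            for member, weight in FTSE_CONSTITUENTS.items():
                sector = SECTORS[member]
                exposure[sector] = exposure.get(sector, 0.0) + units * weight
        else:
            sector = SECTORS[asset]
            exposure[sector] = exposure.get(sector, 0.0) + units
    return exposure

portfolio = {"FTSE": 1, "BP": 100}

# Reading the portfolio itself is always allowed; the classification question may not be.
print(sector_exposure(portfolio, can_see_constituents=True))
try:
    sector_exposure(portfolio, can_see_constituents=False)
except PermissionError as err:
    print(err)
```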

Rob: So you're choosing where to process the data. And then what does this look like? Can you describe your architecture?

Tom: Absolutely. At the web API tier we make sure that everything's safe. We rely on Shield Advanced for denial-of-service protection, and we rely on a lot more of AWS's security features, like GuardDuty, which we can very highly recommend to people. Then we run containers for the API on a Kubernetes cluster, so every request that comes in gets validated, and then we figure out where it's got to go after it's been validated. In terms of where it's got to go, it goes to this middle tier, which allows us to figure out which portfolio you're trying to look at and what events need to be built. That middle tier is the thing that returns your latest holdings, or indeed the prices you've asked for. And underneath that, we use a variety of data stores: S3, Aurora, Redshift, you name it, probably, because as everyone out there will know, there's no perfect single storage solution for all the variety of data types you need.

Rob: I want to rewind back to where you were talking about how you calculate the data, and how, if there were any corrections, you would have started that calculation from the same point in time. I'm kind of curious: when is this done? Is it some sort of asynchronous task every time a correction comes in, or is it done on demand when someone asks for it?

Tom: The upside of bi-temporal data is that you can never give anyone the wrong answer; that's one of the most appealing things about it. If I go back and ask you "what did the system look like last night?", well, there's only one correct answer to that question. And even if I'm not using that answer anymore, the caching just needs to be a simple least-recently-used policy. If nobody's asking anymore what the data looked like last week, any built entity you have that answers that question just falls out of the bottom of the cache. In terms of the heuristics we need to apply in a complex caching mechanism, we don't really need any. All we need is the substrate to say, "if you've asked what that looked like as at last night, you need these events, and if you ask what it looks like as at now, you need these ones". We've in a way taken out all the typical caching complexity and reduced it to: which events do you need? OK, go build it. And it solves so many problems when it comes to cache invalidation for us.
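
Because a question keyed by effective at and as at has exactly one answer forever, the cache really can be a plain least-recently-used map, as in this sketch (the builder here is a stand-in, not Finbourne's code):

```python
from datetime import datetime
from functools import lru_cache

@lru_cache(maxsize=1024)
def build_portfolio(portfolio_id: str, effective_at: datetime, as_at: datetime) -> dict:
    # The answer for a given (portfolio, effective at, as at) never changes,
    # so no invalidation logic is needed, only LRU eviction.
    print(f"building {portfolio_id} effective {effective_at} as at {as_at}")
    return {"VOD": 100}   # placeholder result

t = datetime(2020, 5, 4, 17)
build_portfolio("fund-1", t, t)   # computed once...
build_portfolio("fund-1", t, t)   # ...then served from cache; unused entries just age out
```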

Rob: That's amazing, because it sounds like a relatively simple solution to a very difficult problem.

Tom: Yeah. Two hard problems right.

Rob: Yeah, absolutely. So, sticking with your solution, one thing that's noticeable about what you've built is that it's a very specific solution to a problem you're facing. If I'm a developer who wants to consume your product, how much do I need to know about the way you've implemented it, and even about what bi-temporal data is?

Tom: Well, ideally they don't, basically. It's one of those things, right: we've built, effectively, a data substrate that most people shouldn't really have to know about. All they need to know is that if they ask our system "what did the portfolio look like last night?", we'll give them the right answer. In the same way, we've built a portfolio accounting engine, but we don't really expect people to know how to build a portfolio accounting engine, or a valuation system, or indeed how to deal with complex market data. They just need to know that when they come to us, there will be tutorials, examples, and an engineering team that's approachable, friendly and willing to work with everyone. That's kind of the culture we're trying to build. We've solved some hard problems, but really it's all about making the users of our APIs and our software development kits productive.

Rob: Are there any limits on the number of requests or performance considerations that as a developer, I should be aware of?

Tom: We stand behind our four-nines availability, which is again unusual in this sphere. We expect that the throughput you'll get from the system will be more than adequate for any use case other than high-frequency trading, which needs to be co-located with an exchange. We're not in that business, but we have some customers who ask us for the contents of a portfolio at a rate of 10,000 requests per second. For most use cases that we've seen, that's more than enough.

Rob: And when customers are requesting these large volumes, say ten thousand requests a second, what do you need to scale from an infrastructure point of view?

Tom: Yeah, absolutely. We have to solve challenges like needing read replicas in Aurora, as an example, because irrespective of whatever other limitations you talk about, there's a limit to how fast you can get data off disk, and that is really the rate limiter for the performance of the system. It's how fast we can figure out whether or not you're entitled to see the data, and how fast we can get the data off disk to you. We're disk bound and network bound in that regard. But thankfully, with read replicas in Aurora and the ability to scale across our Kubernetes cluster, it's just a question of throwing more hardware and resources at it.

Rob: Is there anything extra that you need to communicate to developers so they can understand and know what your system does and is?

Tom: Typically, what we have to do is sit down with people and explain to them why this is really important, and once they get it, tell them not to worry about it, which is a bit of a paradox, I guess. But there are a number of things that are very important. One, it needs to be correct in terms of open API standards; we've chosen REST, and in that regard consistency of access is very important. The next thing, I guess, is that almost irrespective of how good our engineering is and how good we think we are at building systems like this, it's up to our customers to find it useful and to be productive as a result of it. So what we've done is produce software development kits in five languages, loads of examples on GitHub, tutorials on our website, and most importantly, we listen to people, and whatever feedback they give us about it being usable or not, we act on. That's kind of the secret: it doesn't matter how good your kit is if people don't find it makes their development experience better.

Rob: Lots of systems store data and attach time to it, and I've built something like this a few times in the past. Data mostly comes from the world or an external system, which gives you two maybe similar, but still different, timestamps to think about. Tom calls these as at and effective at. As at is when the system saw an event, so when it arrived or when it was stored. In contrast, effective at is when it happened in the external system or out in the real world. As a side note, I'm skipping over things like clock drift, how dependent systems handle different times, a whole set of complications with distributed systems, and probably a few more concerns you definitely want to think about if you need to build an accurate timestamp. But with these two fields you can look at your system state and external state at any point in the past, which is good. You also need to think about updates, or corrections. Take an update that corrects a value from the past. To preserve the original values, you need to maintain a history: an audit table or, in Finbourne's case, event sourcing, which stores all the events in order with both as at and effective at, making it much easier to query state at a moment in time. Two timestamps and event sourcing solve a really difficult problem for Finbourne and give them consistency guarantees that make it easier for downstream developers to build on top of. But this wasn't how they started. It was after a few iterations, and a long time, that they came to this solution. They solved the problems that were in front of them at each step. Only when they hit an issue that couldn't be solved with their current implementation did they look at designing for the next level of complexity. Finbourne's approach of building for the current set of problems means they are efficient in the now. They engineer for what the customers are asking for in the short term, knowing they'll need to rebuild some time in the future. But designing for an unknown future is impossibly hard. This way, customers get their immediate needs met, and in the future the problems and their solutions will start to become clearer. Let's get back to Tom to hear about what he's learned while building Finbourne and some of their best practices. You've gone really deep into how to represent time, and many systems could benefit from storing and querying these multiple dimensions of time data. It sounds like an interesting solution to implement, but building it is not so trivial. So do you have any guidance on when you should build a bi-temporal system, and when not to?

Tom: Yeah, I guess there's no real general guidance, but it's one of those things where you'll kind of know. From our perspective, we wrote this in the simplest way possible for the first two or three iterations. In the end, we changed the paradigm to event sourcing and invested a huge amount of time and resource in it, but we did it because it was paying off. Bi-temporal data, the way we've talked about it today, sort of seems like an interesting problem to solve, but really what it does is give you certainty when you ask our API or our system a question. One of the hardest things I've had to do in my career is get stuff signed off when there's data involved. I need to create a UAT and a UAT-2 environment, baseline them with backups and restores, apply a code change to one of them, and make sure nobody's messing around with any of the data in either environment. Then, once I've done that for two or three days, we try to roll it out to production, and when it goes to production you discover you've missed something. Bi-temporal data and the ability to ask the system "what exactly did you look like last week?" lets me know that I'm going to get a certain answer, and from that certain answer I can get all my functionality signed off very quickly. That's really why we've invested this kind of time in it, because it's not one of those things you should invest time in unless you know the building is going to pay off.

Rob: The solution is certainly a significant amount of effort, and jumping straight to it is difficult because you don't know where you need to be. Going through these iterations is also an amount of effort, and some of it might be thought of as wasted, if you exclude the learning you get. Was there any stage during these iterations where you thought, actually, looking back now, maybe we could have skipped that phase? Or, even with hindsight, are you happy with the approach you took?

Tom: It was about right; it was the lowest-cost way of doing it. People often have the problem you're alluding to, where you look back in time and go "oh, God, I wish I'd done this differently", but to some extent we don't. We had to go through that learning to figure out what the right solution was. We've also written a portfolio accounting engine that's a one-time pass through all the events and lets you drop daily performance out of the bottom of it, but that's another thing we only discovered on the fifth iteration of writing it. It's one of those things where you look back and go, well, Git or GitHub is the best source control system, but you kind of had to go through Subversion and CVS and all the previous incarnations of source control before you figured out that Git is the right way of doing it. And you know what, people might say Git isn't the right way of doing it, there's a better version out there. But until you've been through that pain, you don't know what the right answer is.

Rob: A big thanks to Tom and his team for sharing how they think about the multiple dimensions that time can have, and a look into their implementation. Tom's team are really open and willing to talk to developers in the community about bi-temporal data, so if you want to find out more or get in touch with them, check out their organization at github.com/finbourne, or lusid.com, with an S, to find out about their API and talk to their team. If you're excited about building the next big thing, or you want to learn from the engineers that have been there and done that, subscribe to Startup Engineering wherever you get your podcasts. And remember to check out the show notes for useful resources related to this episode. Until the next time, keep on building.