So you want to start with xAPI? Wonderful! Now comes the first hurdle: selecting a Learning Record Store. Of course we think you should choose Learning Locker, but beyond brand names, what are the key questions you should be asking of any LRS provider?
This checklist is the process we run through when working with a new client to discover more about their LRS requirements. You might not need to answer every question, and I’m sure it’s not exhaustive, but it will give you some sensible questions to think about and ask your potential suppliers / developers about.
28 Questions to Ask When Deploying an LRS
Before you get started on the technical aspects, there are a few other things that you need to consider…
Knowing about the overall objective in adopting the xAPI will inform your answers throughout this analysis.
One of the first decisions you’ll need to make is whether you want to host your own LRS or use an online service. An online service will be quicker to set up, probably cheaper in the short term (unless your labour cost is zero) and will be tried and tested. However, you will need to be comfortable with the data storage / ownership responsibilities (is all this data OK in the cloud, or do you need it on-premise?) and with the medium / long-term cost of continually paying for a service.
As you develop your application, you are likely to want to do some testing. Remember, xAPI statements are immutable. You don’t want to be poisoning your ‘live’ data with test data. At the very least, your system should be able to segregate data logically into different LRSs for different purposes. If you are consistently pushing updates, you might want to invest in a third environment, which can be used for further testing before committing. So it might be you don’t just want a single instance of an LRS; you may need two or three.
If you need On-site Deployment…
If you’ll be deploying your LRS on-site, you need to be asking yourself these questions:
The availability of the LRS can represent a single point of failure in a learning ecosystem. Where this is the case, any on-site deployment should have built-in redundancy. This can be provided by using a load-balanced setup, distributing the LRS application between multiple application servers and hosting the database on different servers from the application.
Typically, most LRSs will utilise a NoSQL-style document database for storage. These are well suited to storing non-relational data like xAPI statements. They scale well and they are built for redundant running (for example, Replica Sets are a standard feature of MongoDB). But this is a different tech stack to the normal LAMP/WAMP setup. Do you have the internal capability to manage this?
How many simultaneous requests will be made of your servers? How many statements will be sent per minute? How long will each request take to process? How many queries will be made of the data? A handy tool to help you calculate this is Little’s Law.
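As a rough illustration of how Little’s Law can be applied here: the average number of requests in the system (L) equals the arrival rate (λ) multiplied by the average time each request spends being processed (W). The workload figures below are hypothetical placeholders, not measurements from any real LRS; swap in your own numbers.

```python
def concurrent_requests(arrival_rate_per_sec: float, avg_seconds_per_request: float) -> float:
    """Little's Law: L = lambda * W (average number of requests in flight)."""
    return arrival_rate_per_sec * avg_seconds_per_request


# Hypothetical workload: 600 statements per minute, each taking 0.2 s to process.
statements_per_minute = 600
arrival_rate = statements_per_minute / 60  # requests per second
in_flight = concurrent_requests(arrival_rate, 0.2)

print(f"Average requests in flight: {in_flight:.1f}")
```

If the result is close to, or above, the number of requests your application servers can handle at once, that is a sign you need more capacity or faster processing.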
The killer question for redundancy and backup is always how much data can you afford to lose? Of course the preferred answer is ‘none’ but that tends to be unrealistic in the face of cost / benefit analysis. In the worst-case scenario, how much data could you afford to lose and hope to recover normal operating practice?
For many circumstances putting in some failover mechanism and also doing off-site daily backups is enough. But in high-risk or testing environments, even that might not be good enough. How can you get that 24hr number down to 1 or 2 hrs without breaking the bank? How much money will you have to spend to reduce this number?
What existing disaster recovery processes are already in place? See above.
How is the server / database hardened against intrusion? Is the database to be encrypted? What about physical access controls?
If you want SaaS Deployment…
If you’re going down the SaaS route, be sure to ask the following:
Typically, Software-as-a-Service is delivered as a multi-tenant application: your data will sit alongside other organisations’ data at rest. It will be logically separated. Is that OK? What happens if you want to go single-tenant?
Does your organisation require data to be stored in a particular geographic location? Or, perhaps more likely, are there particular areas of the world where your data at rest must not be stored?
If your provider’s data centre blew up, how much data could you lose?
What existing disaster recovery processes are already in place?
How is the server / db hardened against intrusion? Is the database to be encrypted? What about physical access controls? What certifications does the provider have? What’s the contract and relationship between software provider and host?
Regardless of your deployment scenario, you’ll want to know the answers to the following questions regarding sending your data:
What systems will be used to create and send xAPI data to the LRS initially? Are they all using v1.0+ of the spec? Any known issues?
If multiple Activity Providers (APs) are in the system, you will need to ensure they are using a common identifier for users. Mailbox is the most common, but it isn’t very desirable: email addresses change, and they explicitly identify a user in plaintext. It would be better to use an account number, a unique identifier used throughout the ecosystem. If you don’t have a single identifier, you will need to make sure your LRS can help reconcile users, as Learning Locker does by creating personas that group all of a learner’s different identifiers.
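To make the difference concrete, here is a minimal sketch of an xAPI actor identified by an account rather than a mailbox. The `account` object with `homePage` and `name` comes from the xAPI specification; the home page URL and the account number shown are made-up examples.

```python
import json


def account_actor(home_page: str, account_name: str) -> dict:
    """Build an xAPI Agent identified by an account instead of an mbox."""
    return {
        "objectType": "Agent",
        "account": {
            "homePage": home_page,  # the system that issued the account
            "name": account_name,   # the unique ID within that system
        },
    }


# Hypothetical identity provider and employee number, for illustration only.
actor = account_actor("https://hr.example.com", "EMP-004217")
print(json.dumps(actor, indent=2))
```

Every Activity Provider in the ecosystem would need to send the same `homePage` and `name` pair for a given learner for this to work as a common identifier.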
For most production systems it is not good enough to simply ‘fire and forget’ xAPI statements; they should be queued and tracked to make sure they actually get to the LRS in one piece. For example, if the LRS wasn’t available for any reason, would the Activity Provider cache/store the statement to be resent at a later time?
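The queue-and-retry idea can be sketched as follows. This is an in-memory illustration only, under the assumption that a `send` callable (hypothetical here) returns `True` when the LRS accepted the statement; a production Activity Provider would persist the queue to disk or a message broker so that statements survive a restart. Giving each statement its own `id` before queuing means a resend carries the same identifier as the original attempt.

```python
import uuid
from collections import deque
from typing import Callable


class StatementQueue:
    """Hold statements until the LRS confirms receipt; keep them on failure."""

    def __init__(self, send: Callable[[dict], bool]):
        self._send = send          # hypothetical transport: True means accepted
        self._pending = deque()

    def enqueue(self, statement: dict) -> None:
        self._pending.append(statement)

    def flush(self) -> int:
        """Try to deliver queued statements in order; stop at the first failure."""
        delivered = 0
        while self._pending:
            try:
                accepted = self._send(self._pending[0])
            except Exception:
                accepted = False   # treat transport errors as "try again later"
            if not accepted:
                break
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self) -> int:
        return len(self._pending)


# Demo with a fake LRS that is unavailable on the first attempt.
attempts = {"count": 0}


def flaky_send(statement: dict) -> bool:
    attempts["count"] += 1
    return attempts["count"] > 1


queue = StatementQueue(flaky_send)
queue.enqueue({"id": str(uuid.uuid4()), "verb": {"id": "http://adlnet.gov/expapi/verbs/completed"}})

first_try = queue.flush()    # LRS "down": statement stays queued
second_try = queue.flush()   # LRS "back": statement is delivered
print(first_try, second_try, len(queue))
```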
The easiest way to be conformant with the spec is to use standard libraries. If your AP doesn’t use these, how do they evidence that they are following best practices?
If your AP is sending a lot of statements, does it have any ability to batch them into fewer requests to place less load on the server? It’s generally easier for a server to process 1 request with 1000 statements than it is to process 1000 requests with 1 statement each. Batching like this is perfectly valid; the LRS differentiates between the time an event was observed (the statement’s timestamp) and the time the statement was stored.
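A minimal sketch of batching on the Activity Provider side is shown below. The xAPI statements resource accepts a POST whose body is an array of statements, so each batch can go out as a single request. The batch size of 1000 simply mirrors the example above; the right size for your LRS is an assumption to check with your provider, as request size limits vary.

```python
from typing import Iterator


def chunk_statements(statements: list, batch_size: int = 1000) -> Iterator[list]:
    """Yield successive batches of at most batch_size statements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(statements), batch_size):
        yield statements[start:start + batch_size]


# Hypothetical backlog of 2,500 statements waiting to be sent.
backlog = [{"sequence": n} for n in range(2500)]
batch_sizes = [len(batch) for batch in chunk_statements(backlog)]

print(batch_sizes)  # three requests instead of 2,500
```

Each statement should still carry its own `timestamp` set by the AP at the moment of the event, so that delayed delivery in a batch does not distort when things actually happened.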
Your data should be secured in transit, as well as at rest. Every time.
The answers to your data sending questions are important, as is the storage of your data, so be sure that you also get answers to the following:
Your APs should have adopted recipes and you should have some notion of how large your audience is going to be. You can use these two numbers to ballpark the number of statements that could be stored by the system. This can be important in SaaS circumstances, where you are most often charged on the basis of data stored. See our previous post on how many statements Learning Locker can store per GB.
Statements allow for pretty much any sort of document to be sent with a statement as an attachment. Will your APs use this? Are there any restrictions on file types / file sizes on your servers? Can these attachments be accessed outside of querying statements? Will you use a CDN?
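A back-of-the-envelope estimate along these lines might look like the sketch below. Every figure in it is a hypothetical placeholder: the number of learners, the statements each learner generates per month (which your recipes should tell you) and, in particular, the average statement size, which is not a Learning Locker figure and should be replaced with a measurement from your own data.

```python
def estimate_statements(learners: int, statements_per_learner_per_month: int, months: int) -> int:
    """Total statements generated over the period."""
    return learners * statements_per_learner_per_month * months


def estimate_storage_gb(total_statements: int, avg_bytes_per_statement: int) -> float:
    """Approximate raw storage in GB (1 GB = 1e9 bytes), excluding indexes and attachments."""
    return total_statements * avg_bytes_per_statement / 1e9


# Hypothetical inputs, for illustration only.
total = estimate_statements(learners=5000, statements_per_learner_per_month=200, months=12)
storage_gb = estimate_storage_gb(total, avg_bytes_per_statement=2000)

print(f"{total:,} statements, roughly {storage_gb:.0f} GB before indexes")
```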
Do you have any process for archiving old data? If not, think about it… the LRS is going to get pretty big in years 2, 3 and 4…
If your statements are stored as raw plaintext, is any personally identifiable data associated with your users? Mailbox is a likely candidate here. Avoid where possible.
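Where an email address is the only identifier available, the xAPI specification also allows an Agent to be identified by `mbox_sha1sum`, the hex-encoded SHA-1 hash of the `mailto:` IRI, instead of the plaintext `mbox`. The sketch below shows how that value is derived. Treat it as obfuscation rather than anonymisation, since anyone who can guess an address can compute the same hash. Lower-casing the address first is an assumption made here for consistency between systems, not a requirement of the spec.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """Hex-encoded SHA-1 of the mailto IRI, as used by xAPI's mbox_sha1sum."""
    mailto_iri = "mailto:" + email.strip().lower()  # normalisation is our assumption
    return hashlib.sha1(mailto_iri.encode("utf-8")).hexdigest()


# Hypothetical learner address.
actor = {
    "objectType": "Agent",
    "mbox_sha1sum": mbox_sha1sum("learner@example.com"),
}
print(actor)
```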
Sending and storing data queries resolved, you also need to consider how you’ll share and query your data…
Will other tools consume data from the LRS in real-time? If so, how will you facilitate this? Is real-time really required, or could it be near-time?
Data retrieval can be optimised by creating new indexes based on common requests. You can know some of this in advance, based on your use case, but it can be an on-going process.
How will data be pushed to other data warehouses? Will you set up a cron job?
If other systems rely on the LRS to push data to them, is a queuing system in place to ensure data can be resent if it fails for whatever reason?
Get Started Today
Needs Analysis complete, it’s time to get started! At Learning Pool we offer additional LRS Enterprise services and consultancy alongside the free, open source version of Learning Locker. Find out more about your options here.
Ben Betts was one of the founders of HT2 Labs and his work with the company helped to define the ‘next generation’ of workplace digital learning platforms. Under Ben’s direction, HT2 Labs were amongst the first to put gamification into a Learning Experience Platform. They were the first to really grasp how social learning could be applied in the workplace. And HT2 Labs were the first to release an enterprise-ready Learning Record Store.
As Chief Product Officer, his focus is now on developing Learning Pool’s product portfolio and strategy. For the wider industry, he’s also focused on helping companies learn from employees’ collective experiences, on the role of self-directed learning in the workplace and on social learning, gamification and xAPI.