GraphQL Getting Past Step 3
GraphQL is becoming one of the most popular data query libraries in the world. This is a talk that I’ve converted into an article. Enjoy!
First Version of Draftbit
- Node
- React
- Postgres
- Raw SQL Queries & knex (for the most part)
- GraphQL
https://media.giphy.com/media/mIvrv5Qe0kHlu/giphy.mp4
Throwing stuff at the wall and hoping it sticks (it didn’t)
We quickly realized we weren’t doing things the right way so we started rewriting new features
Draftbit 1.1
- ReasonML on the client ✅
Draftbit 1.2
- TypeORM ✅
- TypeScript on the Server ✅
- Prisma ✅
Draftbit 1.3
- Dataloader ✅
- Thinking about structuring our GraphQL queries more efficiently ✅✅✅
- Writing Tests ✅✅✅
Good but not great
GraphQL field resolvers aren’t being used correctly. The SQL is fast but because of how we’re handling this in a nested data structure, its not necessarily taking full advantage of GraphQL.
This also meant that we were writing custom queries for every type of request we wanted, instead of being able to traverse the tree.
There are pros & cons to this (but mainly cons).
const getAppById = async (
_,
{ uuid },
{ db, user: { uuid: userId } },
info
) => {
let app;
const result = await db.raw(
`
SELECT a.*,
(
SELECT COUNT(*)
FROM app_screens a_s
INNER JOIN screens s
ON a_s.screen_id = s.id
WHERE app_id = a.id
AND s.is_deleted IS FALSE
) as num_screens
FROM apps a
WHERE a.uuid = ? AND a.is_deleted IS FALSE
`,
[uuid]
);
Much better
We’re utilizing GraphQL’s field resolver pattern correctly. Instead of having to re-write the same query, we were able to utilize Dataloader
to fetch the id from cache on every tick
t.list.field("app", {
type: AppType,
resolve: async ({ uuid }, _, { loaders }) => {
return loaders.appsByWorkspaceUuid.load(uuid);
},
});
Dataloader
https://github.com/graphql/dataloader
DataLoader is a generic utility to be used as part of your application’s data fetching layer to provide a simplified and consistent API over various remote data sources such as databases or web services via batching and caching.
*Dataloader works by taking an array of ids and returning an array of promises that fulfill that data
function getComponents(arrayOfIds) {
const components = getComponentsById(arrayofIds)
return components
}
// ...
const componentLoader = new DataLoader(getComponents)
componentLoader.load("xyz-abc")
componentloader.loadMany([1, 2, 3])
I struggled getting this right for awhile and avoided dataloader. That was partially because we weren’t using field resolvers correctly so I wasn’t thinking about using it the right way.
GraphQL Nexus
https://github.com/prisma-labs/nexus
Declarative, code-first and strongly typed GraphQL schema construction for TypeScript & JavaScript
GraphQL Nexus helps you write your graphql schema with typescript by generating definitions alongside of your code. It’s allowed us to identify errors in the way that we return data very easily.
import { queryType, stringArg, makeSchema } from "nexus";
import { GraphQLServer } from "graphql-yoga";
const Query = queryType({
definition(t) {
t.string("hello", {
args: { name: stringArg({ nullable: true }) },
resolve: (parent, { name }) => `Hello ${name || "World"}!`,
});
},
});
const schema = makeSchema({
types: [Query],
outputs: {
schema: __dirname + "/generated/schema.graphql",
typegen: __dirname + "/generated/typings.ts",
},
});
const server = new GraphQLServer({
schema,
});
server.start(() => `Server is running on http://localhost:4000`);
GraphQL Nexus doesn’t do anything else besides constructing your graphql schema (code-first), which means that you don’t have to use their stack or servers or whatever. All it will do is generate a schema.graphql
file that you can include.
TypeORM
https://github.com/typeorm/typeorm
ORM for TypeScript and JavaScript (ES7, ES6, ES5). Supports MySQL, PostgreSQL, MariaDB, SQLite, MS SQL Server, Oracle, SAP Hana, WebSQL databases. Works in NodeJS, Browser, Ionic, Cordova and Electron platforms. http://typeorm.io
TypeORM is the ORM we use with our Postgres database. Instead of writing raw sql queries we’re able to build and join entities easily that are typed.
import {Entity, PrimaryGeneratedColumn, Column, BaseEntity} from "typeorm";
@Entity()
export class User extends BaseEntity {
@PrimaryGeneratedColumn()
id: number;
@Column()
firstName: string;
@Column()
lastName: string;
@Column()
age: number;
}
You create an entity file like this. It supports any type of database schema that you’ve got. Then all you have to do is write functions like:
const allUsers = await User.find();
const firstUser = await User.findOne(1);
const peter = await User.findOne({ firstName: "Peter", lastName: "P" });
Caching
- Database level
- ORM level
- Apollo Server level
- Apollo Client level
It’s easy to forget about all the caching solutions we’ve had in the past that have been working great. At least I’ve forgotten about lower level tools that used to be a lot more popular: Nginx, HA Proxy, etc.
These days there are modern solutions for caching that let you handle it through your application rather than outside of it.
Caching in TypeORM
Once you set cache:true
in your ormconfig, it will generate a query-results-cache
table in your databae that’ll cache operations for 1 second by default.
Caching is controlled by individual queries:
const users = await connection
.createQueryBuilder(User, "user")
.where("user.isAdmin = :isAdmin", { isAdmin: true })
.cache(true)
.getMany();
If you want to change the amount of time spent caching you can also do cache(5000)
or some arbitrary number to make that happen.
If you’re using a multi database setup then you can also cache using a cluster or redis.
Caching in Apollo
https://www.apollographql.com/docs/apollo-server/performance/caching/
Cache Hints
Cache hints are convenient to Apollo but apparently don’t conform to the GraphQL spec. Use at your own risk (we decided against it)
type Comment @cacheControl(maxAge: 1000) {
post: Post!
}
Full Response Cache
It uses memory LRU cache by default so if you’re using multiple server instances (like Kubernetes or some crazier shit) it might make sense to use Redis (again)
You can utilize this to save your session headers instead of constantly trying to unravel on every request.
Redis or MemCached Response Caching
If you’d prefer to save your cache in Redis (which is what we do) Apollo also offers caching using MemCached or Redis and its really simple to setup:
const { RedisCache } = require('apollo-server-cache-redis');
const server = new ApolloServer({
typeDefs,
resolvers,
cache: new RedisCache({
host: 'redis-server',
}),
});
Persisted Queries
https://www.apollographql.com/docs/apollo-server/performance/apq/
Persisted queries work by reducing the size of your query to a hash, saving bandwidth and improving network performance.
*You need to switch away from Apollo Boost for this to work. There is no server setup required
import { createPersistedQueryLink } from "apollo-link-persisted-queries";
import { createHttpLink } from "apollo-link-http";
import { InMemoryCache } from "apollo-cache-inmemory";
import ApolloClient from "apollo-client";
const link = createPersistedQueryLink().concat(createHttpLink({ uri: "/graphql" }));
const client = new ApolloClient({
cache: new InMemoryCache(),
link: link,
});
Using GET requests for GraphQL
GraphQL queries are usually too long for a GET request which is why we use POST, but if you use APQ you can change that and have the request cached at the CDN level.
*This doesn’t work with batching
Fly.io
Fly makes Heroku apps 50-80% faster. It’s like a CDN but for server processes & app data
Fly sets our servers up at the edge so they load a lot faster. I don’t necessarily understand all the specifics, but working with Kurt to make this a reality at Draftbit has been a great experience.
Fly also includes a redis for every server you deploy and that’s exciting on its own
GraphQL Cost Analysis
https://github.com/pa-bru/graphql-cost-analysis
This can be used to protect your GraphQL servers against DoS attacks, compute the data consumption per user and limit it.
It’s very easy to set up and add it into your middleware. It’s supported by everything from express-graphql
to apollo-server
import costAnalysis from 'graphql-cost-analysis'
const costAnalyzer = costAnalysis({
maximumCost: 1000,
})
You can customize different costs for different queries so you understand what the data consumption is per user or per request.
It’s easy to track down situations where somebody is hijacking your GraphQL server and trying to access information that maybe they shouldn’t be.
GraphQL Depth Limit
https://github.com/stems/graphql-depth-limit
Dead-simple defense against unbounded GraphQL queries. Limit the complexity of the queries solely by their depth.
Since GraphQL can recursively build a tree of your data, it’s really simple to write a query that breaks your server. By setting the depth limit you limit the risk of someone being able to do something like this:
query BreakTheServer {
users {
apps {
users {
apps {
users {
apps {
users {
apps {
}
}
}
}
}
}
}
}
}
GraphQL Rate Limit
https://github.com/teamplanes/graphql-rate-limit
A GraphQL Rate Limiter to add basic but granular rate limiting to your Queries or Mutations.
You can rate limit the amount of requests that are happening on the GraphQL level to prevent from random shit from hitting the fan 😅
We already use GraphQL Shield at Draftbit so this pattern is really easy to setup:
import { createRateLimitRule } from 'graphql-rate-limit';
const rateLimitRule = createRateLimitRule({ identifyContext: (ctx) => ctx.id });
const permissions = shield({
Query: {
// Step 2: Apply the rate limit rule instance to the field with config
getItems: rateLimitRule({ window: "1s", max: 5 })
}
});
The nice part about this is that everything lives inside the Javascript world and not the GraphQL world. It makes it easy for us to fine tune it (and it can be type-checked).
Other Useful Tools
-
toobusy-js
- https://github.com/STRML/node-toobusy
- Stay responsive under extreme load
-
rate-limiter
- https://github.com/microlinkhq/async-ratelimiter
- limit the number of calls that are taking place, backed by redis
- A tip is to use your user’s session variables but if that’s not available, you can limit it by the ip using the additional
request-ip
package
-
reason
-
OneGraph
-
token refresh
Thanks!
Thanks to Tracy, Ann, Pato and This Dot Labs for hosting. It’s been an honor speaking alongside of Shawn too who is a very well known figure in the community.