No, I ain't paying for full-text search | Implementing Full-Text Search in Firestore
It's been a long time since I posted anything here. Life, work, the hustle, everything's been going hard on me these days. I've switched my mother tongue to Satirish™. This is not my best piece of writing, so viewer discretion is a must.
My latest endeavor, a website where ELT teachers can share their resources, got stuck at the point where I needed to let users search for resources with a simple search bar. It kinda looks like Google's home page.
The problem is this: The simpler things are for the end user, the more complex they get to implement for the developer.
The model looks like this (in TypeScript):
type Resource = {
  createdAt: number,
  uid: string,
  title: string,
  shortDescription: string,
  longDescription: string,
  files: string[],
  tags: string[],
} & Model; // Model is another type that has an "id" field
So, the search I need to implement has to do these things:
Check if title contains any token in the query.
Check if shortDescription contains any token in the query.
Check if longDescription contains any token in the query.
Check if tags contain any token in the query.
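To pin down what those four checks mean, here is a minimal in-memory sketch. It is not a Firestore query, just the behavior I'm after, assuming case-insensitive substring matching. The names SearchableResource, tokenize and matchesQuery are made up for illustration.

```typescript
// Hypothetical in-memory version of the four checks; NOT something Firestore can run.
type SearchableResource = {
  title: string;
  shortDescription: string;
  longDescription: string;
  tags: string[];
};

// Split a raw query into lowercase tokens, dropping empty pieces.
function tokenize(query: string): string[] {
  return query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
}

// True if any token is a substring of any text field or of any tag.
function matchesQuery(resource: SearchableResource, query: string): boolean {
  const haystacks = [
    resource.title,
    resource.shortDescription,
    resource.longDescription,
    ...resource.tags,
  ].map((s) => s.toLowerCase());
  return tokenize(query).some((token) =>
    haystacks.some((text) => text.includes(token)),
  );
}
```

Running this over every document on the client would mean downloading the whole collection first, which is exactly what I'm trying to avoid.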
This is a rather complex query, and Google, being a billion dollar company, has limited functionality on Firestore. Mainly:
You can compound 30 filters at most in a single query.
There's nothing like LIKE (like in relational databases) in Firestore (are you confused yet? double snap on your face). So, you can't search for a substring in a string field in Firestore.
So, the Firebase docs say they can't handle a complex query (despite being owned by a multibillion peak engineering company) and suggest that I use another third-party company (multimillion this time).
Don't get me wrong. Full-text search solutions are great because they can handle fuzzy searches like the one I need.
I've checked all the solutions provided. Algolia seemed the best fit since it allowed me to do 10k searches free every month. On the other hand, I duplicate my data, sending it to Algolia just for some fancy search functionality. And the more data I store, the costlier it gets. Also, integrating another service makes testing challenging.
Another thing is, the integration with Firebase extensions takes 2 cents every month for Algolia, which is not an expensive amount, but I have a big problem: I live in the third world.
I live in the third world and people tend to vote for demagogues here, which, as a result, makes life more expensive every second.
So, as an economically-challenged personality who cannot afford a satire attitude towards life but does it anyway, my brainmeats been overheating for the last week to make it as free and functional as possible.
FREE FULL-TEXT SEARCH IN FIREBASE (not rly, let's call it semi-text search)
Remember the Resource model? Let's throw a searchQueries: string[] field in there:
type Resource = {
  createdAt: number,
  uid: string,
  title: string,
  shortDescription: string,
  longDescription: string,
  files: string[],
  tags: string[],
  searchQueries: string[], // you're here
} & Model;
You might be asking: What are we gonna do with this? We gon trim, split and toLowerCase every field we want searchable into it. Here's a sample createResource Firebase function:
const createResource = functions.https.onCall(async (raw: AddResourceSchemaType, context) => {
  // bail out on calls from unauthenticated users
  if (context.auth?.uid === undefined) {
    console.error("anon user call");
    return;
  }
  // i validate and parse the data
  const data = AddResourceSchema.parse(raw);
  const searchQueries = [
    // make title field searchable (split on any run of whitespace so double spaces don't create empty tokens)
    ...data.title.trim().split(/\s+/).map(s => s.toLowerCase()),
    // make tags field searchable
    ...data.tags.map(s => s.toLowerCase()),
  ];
  await firestore.collection("resources").add({
    // ... other fields ...
    searchQueries,
  } as Omit<Resource, "id">);
});
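The sample above only indexes title and tags. If you want the descriptions searchable too, the trim/split/lowercase step can be pulled into one small helper. This is a rough sketch, not what runs in my function; buildSearchQueries is a made-up name, and it assumes plain whitespace tokenization with duplicates removed.

```typescript
// Hypothetical helper: collect lowercase, de-duplicated tokens from text fields and tags.
function buildSearchQueries(texts: string[], tags: string[]): string[] {
  const seen = new Set<string>();
  const tokens: string[] = [];
  // Keep a token only if it is non-empty and not already collected.
  const add = (token: string): void => {
    if (token.length > 0 && !seen.has(token)) {
      seen.add(token);
      tokens.push(token);
    }
  };
  // Text fields are split on any run of whitespace.
  for (const text of texts) {
    for (const word of text.trim().toLowerCase().split(/\s+/)) {
      add(word);
    }
  }
  // Tags are kept whole, only trimmed and lowercased.
  for (const tag of tags) {
    add(tag.trim().toLowerCase());
  }
  return tokens;
}

const exampleTokens = buildSearchQueries(
  ["Past Simple  Worksheet", "Simple grammar practice"],
  ["Grammar", "A2"],
);
// → ["past", "simple", "worksheet", "grammar", "practice", "a2"]
```

Keep in mind that punctuation stays glued to the words here, so "worksheet," and "worksheet" end up as two different tokens. Also, every token you add makes the document bigger, so throwing a long description in there is a trade-off.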
Now that we have the Resource.searchQueries field, we can filter the resources with the array-contains-any operator. (Thank god we have this operator at least).
There's only one problem, though: As the docs for array-contains-any state:
Use the array-contains-any operator to combine up to 30 array-contains clauses on the same field with a logical OR.
I can live with that, my users can live with that as well. As a matter of fact, anyone who is searching a 30-word query on the website? I call'em crazy.
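Since the stored tokens are lowercase and the operator takes 30 values at most, the q parameter has to be cleaned up before it goes anywhere near Firestore. Here's a small sketch of that step, assuming the terms arrive separated by "|" like in my URL params. toSearchTerms and MAX_ARRAY_CONTAINS_ANY are made-up names.

```typescript
// Firestore accepts at most 30 values for array-contains-any (per the docs quoted above).
const MAX_ARRAY_CONTAINS_ANY = 30;

// Hypothetical helper: turn a "|"-separated q parameter into a capped list of terms.
function toSearchTerms(q: string, maxTerms: number = MAX_ARRAY_CONTAINS_ANY): string[] {
  const terms: string[] = [];
  for (const piece of q.split("|")) {
    // Stop as soon as the cap is reached.
    if (terms.length >= maxTerms) break;
    // Lowercase to match how the tokens were stored; skip blanks and duplicates.
    const term = piece.trim().toLowerCase();
    if (term.length > 0 && terms.indexOf(term) === -1) {
      terms.push(term);
    }
  }
  return terms;
}
```

If this comes back empty, skip the filter altogether. As far as I know, Firestore rejects an array-contains-any filter with an empty array.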
That's why my custom useSearch hook implementation looks like this:
const useSearch = (): UseSearchReturnType => {
  // other states and contexts
  const [searchParams] = useSearchParams();
  const params = {
    uid: searchParams.get('uid'),
    q: searchParams.get('q'), // this is what i use
    tags: searchParams.get('tags'),
  };
  // create a `(QueryFieldFilterConstraint | QueryLimitConstraint)[]` to use it with firestore later on
  const filters = [
    ...(
      params.uid === null
        ? []
        : [where('uid', '==', params.uid)]
    ),
    ...(
      params.q === null
        ? []
        : [
          // here i use the filter on searchQueries
          where(
            'searchQueries',
            'array-contains-any',
            // lowercase first, because the stored tokens are lowercase
            params.q.toLowerCase().split('|').slice(0, 20), // firestore limit is 30, 20 to be safe
          )
        ]
    ),
    ...(
      params.tags === null
        ? []
        : [
          where(
            'tags',
            'array-contains-any',
            params.tags.split('|').slice(0, 5), // max 5 tags, cuz why not?
          )
        ]
    ),
    limit(100),
  ];
  // return a widget or smn
}
And tell you what?