findfile

v0.0.0-...-720a4ab
Published: Dec 11, 2021 License: MIT

README

API-first image file text search 🔍

About 😎

findfile is the root API implementation of the file search service.

Store, query, and manage your JPGs, PNGs, and PDFs like you're searching text documents.

Setup 🤓

Prerequisites

In order to work with the scripts in bin, you'll need to have the following installed:

  • jq - version jq-1.6
  • AWS CLI - version aws-cli/1.19.53 Python/3.8.10 Linux/5.11.0-36-generic botocore/1.20.53

⚠ This code has been developed locally on an Ubuntu machine and has not been tested on other systems.

Installation

For a quick start, run the following command and follow the prompts.

bash <(curl -s https://raw.githubusercontent.com/forstmeier/findfile/master/bin/quickstart) | tee "quickstart-$(date +%Y%m%d-%H%M).log"  

For more in-depth usage and configuration, clone this repository, add an etc/config/config.json file (in the structure seen in the bin/create_release script), and run the scripts available in bin.

Usage 🥳

The findfile application listens for file events emitted by the configured target S3 buckets. It updates the database with data from those files, which the user can then query. Two endpoints are provided:

  • /buckets is responsible for adding and removing target buckets 🪣
  • /documents is responsible for running queries against the database 🗂

Below is an example buckets request that adds one target bucket and removes another.

curl -X PUT https://7z8ruudxc9.execute-api.us-east-1.amazonaws.com/production/buckets --header "Content-Type: application/json" --header "x-findfile-security-key: 6758db58-9534-4e63-8eb9-ff402f6c29d7" --data '{"add": ["new-target-bucket"], "remove": ["old-target-bucket"]}'
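The same buckets request can be sketched in Go with only the standard library. This is a minimal illustration, not code from the findfile repository: the names bucketsRequest and newBucketsRequest are hypothetical, and baseURL and securityKey stand in for your own deployment's values. The endpoint path, HTTP method, header names, and JSON fields are taken from the curl example above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
)

// bucketsRequest mirrors the JSON body used in the curl example.
// The type name is hypothetical; only the "add" and "remove" fields
// come from the README.
type bucketsRequest struct {
	Add    []string `json:"add"`
	Remove []string `json:"remove"`
}

// newBucketsRequest builds (but does not send) a PUT request for the
// /buckets endpoint. baseURL is the API root including the stage,
// and securityKey is the value for the x-findfile-security-key header.
func newBucketsRequest(baseURL, securityKey string, add, remove []string) (*http.Request, error) {
	body, err := json.Marshal(bucketsRequest{Add: add, Remove: remove})
	if err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPut, baseURL+"/buckets", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("x-findfile-security-key", securityKey)
	return req, nil
}
```

The returned request can be sent with http.DefaultClient.Do(req). Note that a nil slice is encoded as JSON null rather than an empty array, so pass an empty slice such as []string{} if nothing should be added or removed.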

Below is an example documents query searching for the text "find me".

curl -X PUT https://7z8ruudxc9.execute-api.us-east-1.amazonaws.com/production/documents --header "Content-Type: application/json" --header "x-findfile-security-key: 6758db58-9534-4e63-8eb9-ff402f6c29d7" --data '{"text": "find me"}'

A successful query response will contain the bucket and key values for any files matching the query text.
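A documents query could be issued from Go along the following lines. This is a hedged sketch rather than the project's own client: documentsBody and queryDocuments are hypothetical names, baseURL and securityKey are placeholders for your deployment, and treating any 2xx status as success is an assumption. Because the exact response schema is not shown here, the sketch returns the raw JSON instead of decoding it into a fixed struct.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// documentsBody encodes the query text in the shape used by the
// curl example: {"text": "..."}.
func documentsBody(text string) ([]byte, error) {
	return json.Marshal(map[string]string{"text": text})
}

// queryDocuments sends a PUT request to the /documents endpoint and
// returns the raw JSON response body. The response is left undecoded
// because its schema is not assumed here.
func queryDocuments(client *http.Client, baseURL, securityKey, text string) (json.RawMessage, error) {
	body, err := documentsBody(text)
	if err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPut, baseURL+"/documents", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("x-findfile-security-key", securityKey)

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	// Assumption: any 2xx status counts as a successful query.
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return nil, fmt.Errorf("documents query failed: %s: %s", resp.Status, raw)
	}
	return json.RawMessage(raw), nil
}
```

Calling queryDocuments(http.DefaultClient, baseURL, securityKey, "find me") corresponds to the curl command above; the returned bytes can then be unmarshaled once the bucket and key fields of the response have been confirmed against a live reply.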

Notes

A couple of caveats and potential future changes to be aware of:

  1. Deleting files through the S3 console does not currently emit an event that findfile can listen for; if this is a significant issue, we can look into a solution.
  2. S3 event notifications may be introduced to the current "listening" architecture (this would likely address the above issue).
  3. The stack is not currently very configurable, but its configuration options could be expanded if needed.
  4. The current database defaults are set to stay within a free tier option, but they can be increased if there is interest.

Contribute 🤪

Fork this repository and send a pull request. Follow Go best practices for structure and formatting! 🎉

Directories

  • cmd/lambda/buckets (command)
  • cmd/lambda/files (command)
  • cmd/lambda/index (command)
  • pkg/db
  • pkg/evt
  • pkg/fs
