I have been tasked with developing a website that stores data resources pulled from other APIs and lets users analyse and visualize that data. The data is sensitive, so the risk of data leakage must be kept to a minimum.
I implemented data isolation through row-level multi-tenancy with [acts_as_tenant][1]. I could not implement schema-level multi-tenancy because some database dependencies do not work well with it.
Users are allowed to submit read-only queries and receive the results. I parse the query strings and disallow dangerous keywords such as `DROP`, `INSERT`, etc.
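A minimal sketch of what such a keyword check could look like, in plain Ruby. The error class `UnsafeQueryError` and the keyword list are assumptions made for illustration; the body of `inspect_sql` is not shown in the question, so this is not the actual implementation.

```ruby
# Hypothetical error class and keyword list, for illustration only.
class UnsafeQueryError < StandardError; end

FORBIDDEN_KEYWORDS = %w[
  INSERT UPDATE DELETE DROP ALTER TRUNCATE CREATE GRANT REVOKE
].freeze

# Raises UnsafeQueryError unless the string looks like a single SELECT
# statement that contains none of the forbidden keywords.
def inspect_sql(sql)
  # Drop surrounding whitespace and one trailing semicolon.
  statement = sql.to_s.strip.sub(/;\s*\z/, "")

  # Reject stacked statements such as "SELECT 1; DROP TABLE users".
  raise UnsafeQueryError, "multiple statements" if statement.include?(";")
  raise UnsafeQueryError, "not a SELECT" unless statement.match?(/\ASELECT\b/i)

  FORBIDDEN_KEYWORDS.each do |keyword|
    if statement.match?(/\b#{keyword}\b/i)
      raise UnsafeQueryError, "forbidden keyword: #{keyword}"
    end
  end

  true
end
```

A check like this is purely textual, so it also rejects harmless queries (for example a string literal that contains `drop` or a semicolon), and it says nothing about which rows a permitted `SELECT` can read.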
The database stores sensitive resources, so for every query we either filter the results by `user_unique_id = 'value'` or append a `WHERE` or `ON` clause.
But with this solution, as queries get more complex, it becomes difficult to parse the string safely and guarantee that results stay isolated. Someone who tries hard enough could find a loophole that leaks data, because it is not possible to handle every query combination.
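To make the concern concrete, here is a deliberately naive sketch of string-based tenant scoping and one way it can be bypassed. The helper `scope_to_tenant` and the `reports` table are hypothetical; only the `user_unique_id` column comes from the description above, and the real rewriting logic may differ.

```ruby
# Hypothetical helper: injects a tenant condition by editing the SQL string.
def scope_to_tenant(sql, tenant_id)
  # Escape single quotes so the id is a valid SQL string literal.
  quoted = "'" + tenant_id.to_s.gsub("'", "''") + "'"
  condition = "user_unique_id = #{quoted}"
  statement = sql.strip.sub(/;\s*\z/, "")

  if statement.match?(/\bWHERE\b/i)
    # Prepend the tenant condition to the first WHERE clause.
    statement.sub(/\bWHERE\b/i) { "WHERE #{condition} AND" }
  else
    "#{statement} WHERE #{condition}"
  end
end

scoped = scope_to_tenant("SELECT * FROM reports WHERE year = 2020", "u1")
# scoped == "SELECT * FROM reports WHERE user_unique_id = 'u1' AND year = 2020"

leaky = scope_to_tenant("SELECT * FROM reports WHERE year = 2020 OR 1 = 1", "u1")
# leaky == "SELECT * FROM reports WHERE user_unique_id = 'u1' AND year = 2020 OR 1 = 1"
```

Because `AND` binds more tightly than `OR` in SQL, the second query is evaluated as `(user_unique_id = 'u1' AND year = 2020) OR 1 = 1` and matches every row in the table. Subqueries, `UNION`, and CTEs open similar gaps for a rewriter that only edits the string.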
I have no experience managing large applications. Is running multiple databases, simply one database per user, a recommended implementation? Since users will not be in control of the databases, I would have to maintain the versioning and migrations and ensure the databases all stay identical (which sounds rather repetitive, code-wise).
How would one approach such a situation?

    def query
      sql = params[:query]
      inspect_sql(sql)
      @result = execute_statement(sql)
      render json: @result
    end

    private

    def inspect_sql(sql)
      # Raise an exception for non read-only queries
    end

    # Execute the query and return the rows as an array
    def execute_statement(sql)
      results = ActiveRecord::Base.connection.execute(sql)
      return results.to_a unless results.blank?
      { success: false }
    rescue StandardError
      # Rescue StandardError rather than Exception, so that signals and
      # other system-level errors are not swallowed
      { success: false, exception: true }
    end