Materialized View Purpose[/caption] Before starting with a materialized view, let’s talk about database views.
What is a database view?
A database view is a stored set of queries, which gets executed whenever a view is called or evoked. Unlike the regular tables, the view doesn’t occupy any physical space in your hard disk but its schema and everything is stored in the system memory. It helps abstract away the underlying tables and makes it easier to work with. They can also be called as pseudo tables. Quoted from the PostgerSQL documentation.Making liberal use of views is a key aspect of good SQL database design. Views allow you to encapsulate the details of the structure of your tables, which might change as your application evolves, behind consistent interfaces.
CREATE VIEW company_manager AS SELECT id, name, email FROM companies WHERE role='manager';Now to access all the managers
SELECT * FROM company_managers;Making more use of views makes your DB design much cleaner, but here we are talking more about using Materializing views. As that would lead to the more direct performance boost.
So what is a Materialized view?
The materializing view was first introduced in oracle. But now you can find it in most database systems like PostgreSQL, MicrosoftSQL server, IBM DB2, Sybase. MySQL doesn’t have native support for it, but you can find extensions for it which would help achieve this Materialized view is also called Matview. It is a form of database view that also has the result of the query as well. Which speeds up the results because now, you don’t have to run the query to get the results, as its already there, calculated. Of course, there are cases where we can’t have this, where we need more real-time information. But while generating reports you create a matview and then later refresh the matview to get the updated reports. Things to note about matview are:- It’s read-only (pseudo-table) so you can’t update it.
- You need to refresh the table to get the latest data.
- While refreshing, it would block other connections to access the existing data from the material view, so you need to make the refresh run concurrently
So why use Materialized views in Rails?
- Capture commonly used joins & filters.
- Push data intensive processing from Ruby to Database.
- Allow fast and live filtering of complex associations or calculation fields.
How do you use it in Rails?
Well thanks to active record, it’s quite easy to use this in our code. But we need a bit of SQL as well. First, we add the migration to create the materialized views.bundle exec rails g migration create_all_time_sales_mat_viewIn the migration file, we add the SQL
class CreateAllTimesSalesMatView < ActiveRecord::Migration def up execute <<-SQL CREATE MATERIALIZED VIEW all_time_sales_mat_view AS SELECT sum(amount) as total_sale, DATE_TRUNC('day', invoice_adte) as date_of_sale FROM sales GROUP BY DATE_TRUNC('day', invoice_adte) SQL end def down execute("DROP MATERIALIZED VIEW IF EXISTS all_time_sales_view") end endOnce the view is ready , we can create the model for this at
app/models/all_time_sales_mat_view.rb
class AllTimeSalesMatView < ActiveRecord::Base self.table_name = 'all_time_sales_mat_view' def readonly? true end def self.refresh ActiveRecord::Base.connection.execute('REFRESH MATERIALIZED VIEW CONCURRENTLY all_time_sales_mat_view') end endNow we select and query the model as usual.
AllTimeSalesMatView.select(:date_of_sale) AllTimeSalesMatView.sum(:total_sale)We can’t do any
create
, save
or update
. As its a read-only table.
Creating a table with a total of million sales record for every date in the last year, gave us the following speed improvement.
Regular user system total real (976.4ms) 0.020000 0.000000 0.020000 ( 0.990258) MatiView (2.3ms) 0.000000 0.010000 0.010000 ( 0.012010)Over 10 times speed improvement, yay!!
Summarize
Good Points
- Faster to fetch data.
- Capture commonly used joins & filters.
- Push data intensive processing from Ruby to Database.
- Allow fast and live filtering of complex associations or calculation .fields.
Pain Points
- To alter table we need to write SQL
- We will be using more RAM and Storage
- Requires Postgres 9.3 for MatView
- Requires Postgres 9.4 to refresh concurrently
- Can’t have Live data
- You can fix this by creating your own MatViewTable and updating it with the latest information
References
- https://www.postgresql.org/docs/9.3/static/rules-materializedviews.html
- http://en.wikipedia.org/wiki/Materialized_view
- http://dev.mysql.com/doc/refman/5.7/en/create-view.html
- https://blog.pivotal.io/labs/labs/database-views-performance-rails
- https://www.sitepoint.com/speed-up-with-materialized-views-on-postgresql-and-rails/
Amazing post! Thank you!
One question:
Is this query AllTimeSalesMatView.where(email: ‘[email protected]’) wrong?
The materialized view only has ‘date_of_sale’ and ‘total_sale’ fields. It hasn’t email field.
Thanks for the comment. I was thinking about users when I wrote that, I have updated the code to more proper ActiveRecord Statements.