<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>optimization &#8211; redpanthers.co</title>
	<atom:link href="/tag/optimization/feed/" rel="self" type="application/rss+xml" />
	<link>/</link>
	<description>Red Panthers - Experts in Ruby on Rails, System Design and Vue.js</description>
	<lastBuildDate>Wed, 30 Nov 2016 09:22:46 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=5.2.7</generator>
	<item>
		<title>Materialized Views: Caching database query</title>
		<link>/materialized-views-caching-database-query/</link>
				<comments>/materialized-views-caching-database-query/#comments</comments>
				<pubDate>Wed, 30 Nov 2016 09:22:46 +0000</pubDate>
		<dc:creator><![CDATA[coderhs]]></dc:creator>
				<category><![CDATA[Database]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Materialized]]></category>
		<category><![CDATA[optimization]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Views]]></category>

		<guid isPermaLink="false">https://redpanthers.co/?p=728</guid>
				<description><![CDATA[
				<![CDATA[]]>		]]></description>
								<content:encoded><![CDATA[<p>				<![CDATA[As a part of our database optimization series, this article is related to creating materializing views in the database.
[caption id="" align="alignnone" width="1142"]<img class="size-full" src="https://i-msdn.sec.s-msft.com/dynimg/IC709528.png" alt="Materialzied View" width="1142" height="487" /> Materialized View Purpose[/caption]
Before starting with a materialized view, let&#8217;s talk about database views.


<h2>What is a database view?</h2>


A database view is a stored set of queries, which gets executed whenever a view is called or evoked. Unlike the regular tables, the view doesn&#8217;t occupy any physical space in your hard disk but its schema and everything is stored in the system memory. It helps abstract away the underlying tables and makes it easier to work with.
They can also be called as pseudo tables.
Quoted from the PostgerSQL documentation.


<blockquote>Making liberal use of views is a key aspect of good SQL database design. Views allow you to encapsulate the details of the structure of your tables, which might change as your application evolves, behind consistent interfaces.
&nbsp;</blockquote>




<pre class="lang:pgsql decode:true">CREATE VIEW company_manager AS
SELECT id, name, email
FROM  companies
WHERE role='manager';</pre>


Now to access all the managers


<pre class="lang:pgsql decode:true">SELECT * FROM company_managers;</pre>


&nbsp;
Making more use of views makes your DB design much cleaner, but here we are talking more about using Materializing views. As that would lead to the more direct performance boost.


<h2>So what is a Materialized view?</h2>


The materializing view was first introduced in oracle. But now you can find it in most database systems like PostgreSQL, MicrosoftSQL server, IBM DB2, Sybase. MySQL doesn&#8217;t have native support for it, but you can find extensions for it which would help achieve this
<strong>Materialized view </strong>is also called <strong>Matview. </strong>It is a form of database view that also has the result of the query as well. Which speeds up the results because now, you don&#8217;t have to run the query to get the results, as its already there, calculated. Of course, there are cases where we can&#8217;t have this, where we need more real-time information. But while generating reports you create a matview and then later refresh the matview to get the updated reports.
Things to note about matview are:


<ol>
 	

<li>It&#8217;s read-only (pseudo-table) so you can&#8217;t update it.</li>


 	

<li>You need to refresh the table to get the latest data.</li>


 	

<li>While refreshing, it would block other connections to access the existing data from the material view, so you need to make the refresh run concurrently</li>


</ol>




<h2>So why use Materialized views in Rails?</h2>




<ul>
 	

<li>Capture commonly used joins &amp; filters.</li>


 	

<li>Push data intensive processing from Ruby to Database.</li>


 	

<li>Allow fast and live filtering of complex associations or calculation fields.</li>


</ul>




<h2>How do you use it in Rails?</h2>


Well thanks to active record, it&#8217;s quite easy to use this in our code. But we need a bit of SQL as well.
First, we add the migration to create the materialized views.


<pre class="lang:sh decode:true">bundle exec rails g migration create_all_time_sales_mat_view</pre>


In the migration file, we add the SQL


<pre class="lang:ruby decode:true">class CreateAllTimesSalesMatView &lt; ActiveRecord::Migration
  def up
    execute &lt;&lt;-SQL
      CREATE MATERIALIZED VIEW all_time_sales_mat_view AS
        SELECT sum(amount) as total_sale,
        DATE_TRUNC('day', invoice_adte) as date_of_sale
      FROM sales
      GROUP BY DATE_TRUNC('day', invoice_adte)
    SQL
  end
  def down
    execute("DROP MATERIALIZED VIEW IF EXISTS all_time_sales_view")
  end
end</pre>


Once the view is ready , we can create the model for this at <code>app/models/all_time_sales_mat_view.rb</code>


<pre class="lang:default decode:true">class AllTimeSalesMatView &lt; ActiveRecord::Base
  self.table_name = 'all_time_sales_mat_view'
  def readonly?
    true
  end
  def self.refresh
    ActiveRecord::Base.connection.execute('REFRESH MATERIALIZED VIEW CONCURRENTLY all_time_sales_mat_view')
  end
end</pre>


Now we select and query the model as usual.


<pre class="lang:ruby decode:true">AllTimeSalesMatView.select(:date_of_sale)
AllTimeSalesMatView.sum(:total_sale)</pre>


We can&#8217;t do any <code>create</code>, <code>save</code> or <code>update</code>. As its a read-only table.
Creating a table with a total of million sales record for every date in the last year, gave us the following speed improvement.


<pre class="lang:sh decode:true ">Regular
       user     system      total        real
     (976.4ms)  0.020000   0.000000   0.020000 (  0.990258)
MatiView
     (2.3ms)    0.000000   0.010000   0.010000 (  0.012010)</pre>


Over 10 times speed improvement, yay!!


<h2>Summarize</h2>




<h3>Good Points</h3>




<ul>
 	

<li>Faster to fetch data.</li>


 	

<li>Capture commonly used joins &amp; filters.</li>


 	

<li>Push data intensive processing from Ruby to Database.</li>


 	

<li>Allow fast and live filtering of complex associations or calculation .fields.</li>


</ul>




<h3>Pain Points</h3>




<ul>
 	

<li>To alter table we need to write SQL</li>


 	

<li>We will be using more RAM and Storage</li>


 	

<li>Requires Postgres 9.3 for MatView</li>


 	

<li>Requires Postgres 9.4 to refresh concurrently</li>


 	

<li>Can&#8217;t have Live data


<ul>
 	

<li>You can fix this by creating your own MatViewTable and updating it with the latest information</li>


</ul>


</li>


</ul>




<h2>References</h2>




<ul>
 	

<li>https://www.postgresql.org/docs/9.3/static/rules-materializedviews.html</li>


 	

<li>http://en.wikipedia.org/wiki/Materialized_view</li>


 	

<li>http://dev.mysql.com/doc/refman/5.7/en/create-view.html</li>


 	

<li>https://blog.pivotal.io/labs/labs/database-views-performance-rails</li>


 	

<li>https://www.sitepoint.com/speed-up-with-materialized-views-on-postgresql-and-rails/</li>


</ul>

]]&gt;		</p>
]]></content:encoded>
							<wfw:commentRss>/materialized-views-caching-database-query/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
							</item>
		<item>
		<title>Introduction to generating JSON using PostgreSQL</title>
		<link>/create-json-response-using-postgresql-instead-rails/</link>
				<comments>/create-json-response-using-postgresql-instead-rails/#comments</comments>
				<pubDate>Thu, 24 Nov 2016 04:38:33 +0000</pubDate>
		<dc:creator><![CDATA[coderhs]]></dc:creator>
				<category><![CDATA[Database]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[array]]></category>
		<category><![CDATA[benchmarking]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Faster]]></category>
		<category><![CDATA[generation]]></category>
		<category><![CDATA[JSON]]></category>
		<category><![CDATA[optimization]]></category>
		<category><![CDATA[rails]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[standards]]></category>
		<category><![CDATA[web api]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">https://redpanthers.co/?p=663</guid>
				<description><![CDATA[
				<![CDATA[]]>		]]></description>
								<content:encoded><![CDATA[<p>				<![CDATA[

<h2>Introduction</h2>


One of the major requirements for any online business is to have a backend that either provides or can be extended to provide an API response. Building  websites with static HTML and simple jquery ajax is coming to an end. In this era, Javascript frameworks rules the market. Hence, it is a good decision for the database to support JSON, as JSON is becoming the glue that connects the frontend and backend.
Rails have an inbuilt support for generating JSON, as it&#8217;s our swiss army knife of web development, and encourages the REST URL structure . And its a good choice for building API. It is good enough to a particular point of growth. Very soon you will reach bottlenecks, where you have more requests than you can handle and you have to either spawn up more servers or use some concurrent languages like elixir, go, etc. Before we go to that scale and burn down the existing codebase, we can use database to generate JSON responses for us, which is 10 times faster in generating JSON than Rails (though more verbose).
Since PostgreSQL 9.2, the database has taken a major leap in supporting JSON. The support that PostgreSQL provides can be divided into two


<ul>
 	

<li>Storing data in JSON and JSONB format</li>


 	

<li>Generating JSON results from the query itself</li>


</ul>


In this article, we will talk about generating JSON(an introduction) from the query itself.


<h2>Getting Started</h2>


One of the advantages of using a database to generate JSON is that I have found it fast while generating smaller JSON but much more faster in generating complex JSON. (Note: The speed is in comparison with rails not with respect to the database itself)


<h3><strong>How to generate JSON</strong></h3>




<div></div>




<div>Simplest way to do that is row_to_json()
For example: Query to return user with id 1 as JSON</div>




<div>


<pre class="lang:pgsql decode:true">select row_to_json(users) from users where id = 1;
</pre>


Result:


<pre class="lang:js decode:true">{"id":1,"email":"hsps@redpanthers.co","encrypted_password":"iwillbecrazytodisplaythat",
"reset_password_token":null,"reset_password_sent_at":null,
"remember_created_at":"2016-11-06T08:39:47.983222",
"sign_in_count":11,"current_sign_in_at":"2016-11-18T11:47:01.946542",
"last_sign_in_at":"2016-11-16T20:46:31.110257",
"current_sign_in_ip":"::1","last_sign_in_ip":"::1",
"created_at":"2016-11-06T08:38:46.193417",
"updated_at":"2016-11-18T11:47:01.956152",
"first_name":"Super","last_name":"Admin","role":3}</pre>


if you want to send only some specific fields
</div>




<pre class="lang:pgsql decode:true">select row_to_json(results)
from (
  select id, email from users
) as results</pre>


Result


<pre class="lang:js decode:true">{"id":1,"email":"hsps@redpanthers.co"}</pre>


Now let&#8217;s see how to generate more complex JSON with sub JSON, and arrays.


<pre class="lang:pgsql decode:true">select row_to_json(result)
from (
  select id, email,
    (
      select array_to_json(array_agg(row_to_json(user_projects)))
      from (
        select id, name
        from projects
        where user_id=users.id
        order by created_at asc
      ) user_projects
    ) as projects
  from users
  where id = 1
) result</pre>


This would return the JSON response


<pre class="lang:default decode:true ">{"id":1,"email":"hsps@redpanthers.co", "project":["id": 3, "name": "CSnipp"]}</pre>


The issue with the above code is that it is more verbose (has more text)  when compared to a ruby code. We need to make sure that while we do a bit of sacrifice there, is worthwhile. So while working with API&#8217;s  use it only where you see a delay in JSON generation.
Similarly ,to the <strong>&#8216;array_agg&#8217;</strong> method that we used above to aggregate values to an array then to JSON, we aggregate them as JSON using <code>json_agg</code>.


<pre class="lang:pgsql decode:true">array_to_json(array_agg(row_to_json(user_projects)))</pre>


can be shortened to


<pre class="lang:pgsql decode:true">json_agg(user_projects)</pre>


&nbsp;
Since the above method of array generation can be tedious, in PostgreSQL 9.4, they have introduced a new method called <code>json_build_object</code>. Simple usage of the function can be as below


<pre class="lang:pgsql decode:true ">json_build_object('foo',1,'bar',2)</pre>


which will output


<pre class="lang:js decode:true ">{"foo": 1, "bar": 2}</pre>


Also, we can use it to build complex JSON tree by creating functions within the PostgreSQL database. Of course, as we do that, we are moving more and more logic of the code into the DB and we would need to run migrations every time when we want to update a function. So as I said before, we are sacrificing our convenience here .So we should only use this, as the complexity of our JSON generation increases.
I will be covering how to write PostgreSQL functions to help generate more complex JSON structure easier in the second part of this particle.


<h2>References</h2>


https://www.postgresql.org/docs/current/static/functions-json.html
http://bytefish.de/blog/postgresql_json/]]&gt;		</p>
]]></content:encoded>
							<wfw:commentRss>/create-json-response-using-postgresql-instead-rails/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
							</item>
		<item>
		<title>Different types of Index in PostgreSQL</title>
		<link>/different-types-index-postgresql/</link>
				<pubDate>Mon, 19 Sep 2016 07:38:45 +0000</pubDate>
		<dc:creator><![CDATA[coderhs]]></dc:creator>
				<category><![CDATA[Beginners]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[B Tree]]></category>
		<category><![CDATA[BRIN]]></category>
		<category><![CDATA[GiST]]></category>
		<category><![CDATA[Hash]]></category>
		<category><![CDATA[indexing]]></category>
		<category><![CDATA[optimization]]></category>
		<category><![CDATA[part two]]></category>
		<category><![CDATA[series]]></category>
		<category><![CDATA[SP-GiST]]></category>
		<category><![CDATA[Speeing up database]]></category>

		<guid isPermaLink="false">https://redpanthers.co/?p=549</guid>
				<description><![CDATA[
				<![CDATA[]]>		]]></description>
								<content:encoded><![CDATA[<p>				<![CDATA[This is part two of our PostgreSQL optimization series. You can read the first article where we discuss when to index <a href="https://redpanthers.co/optimising-postgresql-database-query-using-indexes/">here</a>.
PostgreSQL uses a different set of algorithm while indexing tables, each type of algorithm is good for a certain set of data. Here we will be discussing the various algorithms available and when we should be using them. (Note these are the algorithms found in PostgreSQL 9.5)


<h2>Algorithms</h2>


<strong>B-Tree (Balance Tree),</strong> is the <strong>default</strong> algorithm used when we build indexes in Rails. It keeps a sorted copy of our column, which would be our index. So if we want to find the row of the word starting with <strong>a </strong>then as soon as the words starting with a are over. It will stop searching and return null, as the index has kept everything sorted. It is good in most cases, hence it is the default algorithm used.
<strong>Hash </strong>is one of the most popular indexing algorithms. But only the equate operator works on it, thus the query planner will only use an index with a hash algorithm if we do an equal operation searching for it. Another point to note is that Hash index is not WAL (Write Ahead Log) logged, so if the database crash we can&#8217;t rebuild the index and would need to REINDEX the entire column.
<strong>GIN</strong>, <strong>Generalized Inverted Indexing</strong> are great for indexing columns and expressions that contain an array, JSON, JSONB, etc. Internally, a <acronym class="ACRONYM">GIN</acronym> index contains a B-tree index constructed over keys, where each key is an element of one or more indexed items and where each tuple in a leaf page contains either a pointer to a B-tree of heap pointers.
<strong>GiST</strong>, <strong>Generalized Search Tree</strong> isn&#8217;t a single indexing scheme but rather an abstraction that makes it possible to implement indexing schemes for new data types by providing a balanced tree structure access method. In the past building and implementing custom indexing algorithm for custom data types include an understanding of the internals of the database. With the implementation of GiST, it provides an abstraction of the internal working which can be used to build your own indexing algorithm. It uses B-Tree internally, and thus we can use GiST to index IP address, Geo Location, etc.
<strong>SP-GiST</strong>, <strong>Space Partitioned  Generalized Search Tree</strong> &#8211; as the name suggest its GiST implementation itself but instead of balance tree structure we can use one of the non-balanced tree structure such as radix tree, quadtree, k-d tree.
<strong>BRIN, Block Range Indexes </strong>are designed to handle very large tables in which the rows’ natural sort order correlates to certain column values. For example, a table storing log entries might have a timestamp column for when each log entry was written. By using a BRIN index on this column, scanning large parts of the table can be avoided when querying rows by their timestamp value with very little overhead.
&nbsp;]]&gt;		</p>
]]></content:encoded>
										</item>
	</channel>
</rss>
