<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Tech - Falabella</title>
	<atom:link href="https://falabellaindia.com/portfolio_cat/tech/feed/" rel="self" type="application/rss+xml" />
	<link>https://falabellaindia.com</link>
	<description></description>
	<lastBuildDate>Thu, 20 Oct 2022 22:13:11 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.8.5</generator>

<image>
	<url>https://falabellaindia.com/wp-content/uploads/2021/11/loader-90x90.png</url>
	<title>Tech - Falabella</title>
	<link>https://falabellaindia.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Google Sign in with Jetpack Compose</title>
		<link>https://falabellaindia.com/portfolio/2987-2/</link>
		
		<dc:creator><![CDATA[admin]]></dc:creator>
		<pubDate>Thu, 20 Oct 2022 22:08:25 +0000</pubDate>
				<guid isPermaLink="false">https://falabellaindia.com/?post_type=portfolio&#038;p=2987</guid>

					<description><![CDATA[<p>This article is part of a series exploring Jetpack Compose APIs. Here we look at how to use the rememberLauncherForActivityResult API by implementing Google sign-in. rememberLauncherForActivityResult is used to get a result back from another activity, for example when launching a camera, document picker or contacts [&#8230;]</p>
<p>The post <a href="https://falabellaindia.com/portfolio/2987-2/">Google Sign in with Jetpack Compose</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></description>
										<content:encoded><![CDATA[<p id="ec98" class="pw-post-body-paragraph kc kd jf ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">This article is part of a series exploring Jetpack Compose APIs. Here we look at how to use the <code class="fl la lb lc ld b">rememberLauncherForActivityResult</code> API by implementing Google sign-in.</p>
<p id="ce60" class="pw-post-body-paragraph kc kd jf ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph=""><code class="fl la lb lc ld b">rememberLauncherForActivityResult</code> is used to get a result back from another activity, for example when launching a camera, document picker or contacts picker through an intent. We will use it to launch our own custom intent from a composable function and receive the result.</p>
<h2 id="458b" class="lk ll jf bm lm ln lo lp lq lr ls lt lu kn lv lw lx kr ly lz ma kv mb mc md me fw" data-selectable-paragraph="">Google Auth Project Creation:</h2>
<p id="0ea9" class="pw-post-body-paragraph kc kd jf ke b kf mf kh ki kj mg kl km kn mh kp kq kr mi kt ku kv mj kx ky kz in fw" data-selectable-paragraph="">The first thing we need to do is <a class="au mk" href="https://developers.google.com/identity/sign-in/android/start-integrating" target="_blank" rel="noopener ugc nofollow">configure</a> a Google API Console project, which gives us the OAuth 2.0 client ID. Once you have the <code class="fl la lb lc ld b">OAuth 2.0 client ID</code>, paste it into your local string resource XML file.</p>
<pre class="le lf lg lh gt ml bs mm mn dz ld"><span id="8bf6" class="fw lk ll jf ld b dm mo mp l mq" data-selectable-paragraph="">&lt;<strong class="ld jg">string name="gcp_id"</strong>&gt;553313394578-6uuv2ohetikvrqkb6nekri7en4qbca74.example.com&lt;/<strong class="ld jg">string</strong>&gt;</span></pre>
<h2 id="4075" class="lk ll jf bm lm ln lo lp lq lr ls lt lu kn lv lw lx kr ly lz ma kv mb mc md me fw" data-selectable-paragraph="">Add the Google auth SDK app dependencies:</h2>
<pre class="le lf lg lh gt ml bs mm mn dz ld"><span id="eb59" class="fw lk ll jf ld b dm mo mp l mq" data-selectable-paragraph="">implementation <strong class="ld jg">'com.google.android.gms:play-services-auth:19.2.0'</strong></span></pre>
<h2 id="40de" class="lk ll jf bm lm ln lo lp lq lr ls lt lu kn lv lw lx kr ly lz ma kv mb mc md me fw" data-selectable-paragraph="">Let’s write some more code:</h2>
<p id="c91c" class="pw-post-body-paragraph kc kd jf ke b kf mf kh ki kj mg kl km kn mh kp kq kr mi kt ku kv mj kx ky kz in fw" data-selectable-paragraph="">Here we first declare a function in MainActivity that builds our Google sign-in client.</p>
<figure class="le lf lg lh gt iw">
<div class="m fo l do">
<div class="xu lj l">
<table class="highlight tab-size js-file-line-container js-code-nav-container js-tagsearch-file" data-hpc="" data-tab-size="8" data-paste-markdown-skip="" data-tagsearch-lang="Kotlin" data-tagsearch-path="MainActivity.kt">
<tbody>
<tr>
<td id="file-mainactivity-kt-LC1" class="blob-code blob-code-inner js-file-line"><span class="pl-k">private</span> <span class="pl-k">fun</span> <span class="pl-en">getGoogleLoginAuth</span>(): <span class="pl-en">GoogleSignInClient</span> {</td>
</tr>
<tr>
<td id="file-mainactivity-kt-L2" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="2"></td>
<td id="file-mainactivity-kt-LC2" class="blob-code blob-code-inner js-file-line"><span class="pl-k">val</span> gso <span class="pl-k">=</span> <span class="pl-en">GoogleSignInOptions</span>.<span class="pl-en">Builder</span>(<span class="pl-en">GoogleSignInOptions</span>.<span class="pl-en">DEFAULT_SIGN_IN</span>)</td>
</tr>
<tr>
<td id="file-mainactivity-kt-L3" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="3"></td>
<td id="file-mainactivity-kt-LC3" class="blob-code blob-code-inner js-file-line">.requestEmail()</td>
</tr>
<tr>
<td id="file-mainactivity-kt-L4" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="4"></td>
<td id="file-mainactivity-kt-LC4" class="blob-code blob-code-inner js-file-line">.requestIdToken(getString(<span class="pl-en">R</span>.string.gcp_id))</td>
</tr>
<tr>
<td id="file-mainactivity-kt-L5" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="5"></td>
<td id="file-mainactivity-kt-LC5" class="blob-code blob-code-inner js-file-line">.requestId()</td>
</tr>
<tr>
<td id="file-mainactivity-kt-L6" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="6"></td>
<td id="file-mainactivity-kt-LC6" class="blob-code blob-code-inner js-file-line">.requestProfile()</td>
</tr>
<tr>
<td id="file-mainactivity-kt-L7" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="7"></td>
<td id="file-mainactivity-kt-LC7" class="blob-code blob-code-inner js-file-line">.build()</td>
</tr>
<tr>
<td id="file-mainactivity-kt-L8" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="8"></td>
<td id="file-mainactivity-kt-LC8" class="blob-code blob-code-inner js-file-line"><span class="pl-k">return</span> <span class="pl-en">GoogleSignIn</span>.getClient(<span class="pl-c1">this</span>, gso)</td>
</tr>
<tr>
<td id="file-mainactivity-kt-L9" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="9"></td>
<td id="file-mainactivity-kt-LC9" class="blob-code blob-code-inner js-file-line">}</td>
</tr>
</tbody>
</table>
</div>
</div><figcaption class="mr bl gj gh gi ms mt bm b bn bo cn">declare the Google sign-in client</figcaption></figure>
<p id="065c" class="pw-post-body-paragraph kc kd jf ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">Now comes the important part: we declare <code class="fl la lb lc ld b"><a class="au mk" href="https://developer.android.com/reference/kotlin/androidx/activity/compose/package-summary#rememberlauncherforactivityresult" target="_blank" rel="noopener ugc nofollow">rememberLauncherForActivityResult()</a></code> in the composable function that contains the sign-up UI.</p>
<figure class="le lf lg lh gt iw">
<div class="m fo l do">
<div class="xv lj l">
<table class="highlight tab-size js-file-line-container js-code-nav-container js-tagsearch-file" data-hpc="" data-tab-size="8" data-paste-markdown-skip="" data-tagsearch-lang="Kotlin" data-tagsearch-path="SignupComponent.kt">
<tbody>
<tr>
<td id="file-signupcomponent-kt-LC1" class="blob-code blob-code-inner js-file-line"><span class="pl-k">val</span> startForResult <span class="pl-k">=</span></td>
</tr>
<tr>
<td id="file-signupcomponent-kt-L2" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="2"></td>
<td id="file-signupcomponent-kt-LC2" class="blob-code blob-code-inner js-file-line">rememberLauncherForActivityResult(<span class="pl-en">ActivityResultContracts</span>.<span class="pl-en">StartActivityForResult</span>()) { result<span class="pl-k">:</span> <span class="pl-en">ActivityResult</span> <span class="pl-k">-></span></td>
</tr>
<tr>
<td id="file-signupcomponent-kt-L3" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="3"></td>
<td id="file-signupcomponent-kt-LC3" class="blob-code blob-code-inner js-file-line"><span class="pl-k">if</span> (result.resultCode <span class="pl-k">==</span> <span class="pl-en">Activity</span>.<span class="pl-en">RESULT_OK</span>) {</td>
</tr>
<tr>
<td id="file-signupcomponent-kt-L4" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="4"></td>
<td id="file-signupcomponent-kt-LC4" class="blob-code blob-code-inner js-file-line"><span class="pl-k">val</span> intent <span class="pl-k">=</span> result.data</td>
</tr>
<tr>
<td id="file-signupcomponent-kt-L5" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="5"></td>
<td id="file-signupcomponent-kt-LC5" class="blob-code blob-code-inner js-file-line"><span class="pl-k">if</span> (intent <span class="pl-k">!=</span> <span class="pl-c1">null</span>) {</td>
</tr>
<tr>
<td id="file-signupcomponent-kt-L6" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="6"></td>
<td id="file-signupcomponent-kt-LC6" class="blob-code blob-code-inner js-file-line"><span class="pl-k">val</span> task<span class="pl-k">:</span> <span class="pl-en">Task</span><<span class="pl-en">GoogleSignInAccount</span>> <span class="pl-k">=</span></td>
</tr>
<tr>
<td id="file-signupcomponent-kt-L7" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="7"></td>
<td id="file-signupcomponent-kt-LC7" class="blob-code blob-code-inner js-file-line"><span class="pl-en">GoogleSignIn</span>.getSignedInAccountFromIntent(intent)</td>
</tr>
<tr>
<td id="file-signupcomponent-kt-L8" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="8"></td>
<td id="file-signupcomponent-kt-LC8" class="blob-code blob-code-inner js-file-line">handleSignInResult(task)</td>
</tr>
<tr>
<td id="file-signupcomponent-kt-L9" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="9"></td>
<td id="file-signupcomponent-kt-LC9" class="blob-code blob-code-inner js-file-line">}</td>
</tr>
<tr>
<td id="file-signupcomponent-kt-L10" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="10"></td>
<td id="file-signupcomponent-kt-LC10" class="blob-code blob-code-inner js-file-line">}</td>
</tr>
<tr>
<td id="file-signupcomponent-kt-L11" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="11"></td>
<td id="file-signupcomponent-kt-LC11" class="blob-code blob-code-inner js-file-line">}</td>
</tr>
</tbody>
</table>
</div>
</div><figcaption class="mr bl gj gh gi ms mt bm b bn bo cn">declare the activity result contract</figcaption></figure>
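<p class="pw-post-body-paragraph kc kd jf ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">The <code class="fl la lb lc ld b">handleSignInResult</code> function called in the callback is not shown in this article. As a minimal sketch, assuming it only needs to unwrap the task and log the outcome, it could look like the following. The <code class="fl la lb lc ld b">TAG</code> constant and the log messages are illustrative assumptions; the <code class="fl la lb lc ld b">getResult(ApiException::class.java)</code> pattern follows Google's sign-in quickstart listed in the references.</p>
<pre class="le lf lg lh gt ml bs mm mn dz ld"><span class="fw lk ll jf ld b dm mo mp l mq" data-selectable-paragraph="">import android.util.Log
import com.google.android.gms.auth.api.signin.GoogleSignInAccount
import com.google.android.gms.common.api.ApiException
import com.google.android.gms.tasks.Task

// Hypothetical log tag, not part of the original article.
private const val TAG = "GoogleSignIn"

fun handleSignInResult(task: Task&lt;GoogleSignInAccount&gt;) {
    try {
        // getResult rethrows a failed sign-in as an ApiException.
        val account = task.getResult(ApiException::class.java)
        // The ID token is only requested because requestIdToken() was set on the options.
        Log.d(TAG, "Signed in as ${account.email}, ID token present: ${account.idToken != null}")
    } catch (e: ApiException) {
        // The status code describes why the sign-in did not complete.
        Log.w(TAG, "Sign-in failed, status code: ${e.statusCode}")
    }
}</span></pre>
<p class="pw-post-body-paragraph kc kd jf ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">In a real app you would probably forward the account or its ID token to your own backend or view model instead of just logging it.</p>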
<p id="d635" class="pw-post-body-paragraph kc kd jf ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">Assuming we have a button that launches the Google sign-in screen, we can call the newly created <code class="fl la lb lc ld b">startForResult</code> launcher inside the button's onClick lambda to launch the sign-in intent, as shown below.</p>
<figure class="le lf lg lh gt iw">
<div class="m fo l do">
<div class="xw lj l">
<table class="highlight tab-size js-file-line-container js-code-nav-container js-tagsearch-file" data-hpc="" data-tab-size="8" data-paste-markdown-skip="" data-tagsearch-lang="Kotlin" data-tagsearch-path="Button.kt">
<tbody>
<tr>
<td id="file-button-kt-LC1" class="blob-code blob-code-inner js-file-line"><span class="pl-en">Button</span>(</td>
</tr>
<tr>
<td id="file-button-kt-L2" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="2"></td>
<td id="file-button-kt-LC2" class="blob-code blob-code-inner js-file-line">onClick <span class="pl-k">=</span> {</td>
</tr>
<tr>
<td id="file-button-kt-L3" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="3"></td>
<td id="file-button-kt-LC3" class="blob-code blob-code-inner js-file-line">startForResult.launch(googleSignInClient?.signInIntent)</td>
</tr>
<tr>
<td id="file-button-kt-L4" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="4"></td>
<td id="file-button-kt-LC4" class="blob-code blob-code-inner js-file-line">},</td>
</tr>
<tr>
<td id="file-button-kt-L5" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="5"></td>
<td id="file-button-kt-LC5" class="blob-code blob-code-inner js-file-line">modifier <span class="pl-k">=</span> <span class="pl-en">Modifier</span></td>
</tr>
<tr>
<td id="file-button-kt-L6" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="6"></td>
<td id="file-button-kt-LC6" class="blob-code blob-code-inner js-file-line">.fillMaxWidth()</td>
</tr>
<tr>
<td id="file-button-kt-L7" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="7"></td>
<td id="file-button-kt-LC7" class="blob-code blob-code-inner js-file-line">.padding(start <span class="pl-k">=</span> <span class="pl-c1">16</span>.dp, end <span class="pl-k">=</span> <span class="pl-c1">16</span>.dp),</td>
</tr>
<tr>
<td id="file-button-kt-L8" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="8"></td>
<td id="file-button-kt-LC8" class="blob-code blob-code-inner js-file-line">shape <span class="pl-k">=</span> <span class="pl-en">RoundedCornerShape</span>(<span class="pl-c1">6</span>.dp),</td>
</tr>
<tr>
<td id="file-button-kt-L9" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="9"></td>
<td id="file-button-kt-LC9" class="blob-code blob-code-inner js-file-line">colors <span class="pl-k">=</span> <span class="pl-en">ButtonDefaults</span>.buttonColors(</td>
</tr>
<tr>
<td id="file-button-kt-L10" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="10"></td>
<td id="file-button-kt-LC10" class="blob-code blob-code-inner js-file-line">backgroundColor <span class="pl-k">=</span> <span class="pl-en">Color</span>.<span class="pl-en">Black</span>,</td>
</tr>
<tr>
<td id="file-button-kt-L11" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="11"></td>
<td id="file-button-kt-LC11" class="blob-code blob-code-inner js-file-line">contentColor <span class="pl-k">=</span> <span class="pl-en">Color</span>.<span class="pl-en">White</span></td>
</tr>
<tr>
<td id="file-button-kt-L12" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="12"></td>
<td id="file-button-kt-LC12" class="blob-code blob-code-inner js-file-line">)</td>
</tr>
<tr>
<td id="file-button-kt-L13" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="13"></td>
<td id="file-button-kt-LC13" class="blob-code blob-code-inner js-file-line">) {</td>
</tr>
<tr>
<td id="file-button-kt-L14" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="14"></td>
<td id="file-button-kt-LC14" class="blob-code blob-code-inner js-file-line"><span class="pl-en">Image</span>(</td>
</tr>
<tr>
<td id="file-button-kt-L15" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="15"></td>
<td id="file-button-kt-LC15" class="blob-code blob-code-inner js-file-line">painter <span class="pl-k">=</span> painterResource(id <span class="pl-k">=</span> <span class="pl-en">R</span>.drawable.ic_logo_google),</td>
</tr>
<tr>
<td id="file-button-kt-L16" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="16"></td>
<td id="file-button-kt-LC16" class="blob-code blob-code-inner js-file-line">contentDescription <span class="pl-k">=</span> <span class="pl-s"><span class="pl-pds">"</span><span class="pl-pds">"</span></span></td>
</tr>
<tr>
<td id="file-button-kt-L17" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="17"></td>
<td id="file-button-kt-LC17" class="blob-code blob-code-inner js-file-line">)</td>
</tr>
<tr>
<td id="file-button-kt-L18" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="18"></td>
<td id="file-button-kt-LC18" class="blob-code blob-code-inner js-file-line"><span class="pl-en">Text</span>(text <span class="pl-k">=</span> <span class="pl-s"><span class="pl-pds">"</span>Sign in with Google<span class="pl-pds">"</span></span>, modifier <span class="pl-k">=</span> <span class="pl-en">Modifier</span>.padding(<span class="pl-c1">6</span>.dp))</td>
</tr>
<tr>
<td id="file-button-kt-L19" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="19"></td>
<td id="file-button-kt-LC19" class="blob-code blob-code-inner js-file-line">}</td>
</tr>
</tbody>
</table>
</div>
</div><figcaption class="mr bl gj gh gi ms mt bm b bn bo cn">button click to launch intent</figcaption></figure>
<p id="bc0d" class="pw-post-body-paragraph kc kd jf ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">And that’s it. If you run the app, you should see the Google sign-in popup when you click the button below, and get the result back in the <code class="fl la lb lc ld b">rememberLauncherForActivityResult</code> callback.</p>
<figure class="le lf lg lh gt iw gh gi paragraph-image">
<div class="gh gi mu"><picture><source srcset="https://miro.medium.com/max/640/1*0eazhhN6x6SKsoyDZIpXAQ.png 640w, https://miro.medium.com/max/720/1*0eazhhN6x6SKsoyDZIpXAQ.png 720w, https://miro.medium.com/max/750/1*0eazhhN6x6SKsoyDZIpXAQ.png 750w, https://miro.medium.com/max/786/1*0eazhhN6x6SKsoyDZIpXAQ.png 786w, https://miro.medium.com/max/828/1*0eazhhN6x6SKsoyDZIpXAQ.png 828w, https://miro.medium.com/max/1100/1*0eazhhN6x6SKsoyDZIpXAQ.png 1100w, https://miro.medium.com/max/600/1*0eazhhN6x6SKsoyDZIpXAQ.png 600w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 300px" data-testid="og" /><img fetchpriority="high" decoding="async" class="ce jb jc c" role="presentation" src="https://miro.medium.com/max/600/1*0eazhhN6x6SKsoyDZIpXAQ.png" alt="" width="300" height="650" /></picture></div>
</figure>
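<p class="pw-post-body-paragraph kc kd jf ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">The snippets above do not show how the client built in <code class="fl la lb lc ld b">MainActivity</code> reaches the composable. One possible arrangement, sketched here under the assumption that the client is simply passed in as a parameter, is shown below. The names <code class="fl la lb lc ld b">GoogleSignInButton</code> and <code class="fl la lb lc ld b">onResult</code> are hypothetical, and the button styling is omitted for brevity.</p>
<pre class="le lf lg lh gt ml bs mm mn dz ld"><span class="fw lk ll jf ld b dm mo mp l mq" data-selectable-paragraph="">import android.app.Activity
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.material.Button
import androidx.compose.material.Text
import androidx.compose.runtime.Composable
import com.google.android.gms.auth.api.signin.GoogleSignIn
import com.google.android.gms.auth.api.signin.GoogleSignInAccount
import com.google.android.gms.auth.api.signin.GoogleSignInClient
import com.google.android.gms.tasks.Task

// Hypothetical composable: the client is built once in the Activity
// (for example with getGoogleLoginAuth()) and passed in as a parameter.
@Composable
fun GoogleSignInButton(
    googleSignInClient: GoogleSignInClient,
    onResult: (Task&lt;GoogleSignInAccount&gt;) -&gt; Unit
) {
    // Register the launcher; the lambda runs when the sign-in activity returns.
    val startForResult =
        rememberLauncherForActivityResult(ActivityResultContracts.StartActivityForResult()) { result -&gt;
            val intent = result.data
            if (result.resultCode == Activity.RESULT_OK &amp;&amp; intent != null) {
                onResult(GoogleSignIn.getSignedInAccountFromIntent(intent))
            }
        }

    // Launch the Google sign-in intent when the button is clicked.
    Button(onClick = { startForResult.launch(googleSignInClient.signInIntent) }) {
        Text(text = "Sign in with Google")
    }
}</span></pre>
<p class="pw-post-body-paragraph kc kd jf ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">With this shape, <code class="fl la lb lc ld b">MainActivity</code> could call something like <code class="fl la lb lc ld b">GoogleSignInButton(getGoogleLoginAuth(), ::handleSignInResult)</code> inside <code class="fl la lb lc ld b">setContent</code>. Passing a non-null client also avoids the safe call on <code class="fl la lb lc ld b">googleSignInClient</code> in the click handler.</p>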
<p id="16be" class="pw-post-body-paragraph kc kd jf ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">We used to do this with the older UI toolkit, but as we gradually adopt Jetpack Compose, questions come up about how to accomplish the same tasks in Compose. I hope this short article helps bridge a small gap in adopting Jetpack Compose. Next time we will learn something new about Jetpack Compose.</p>
<h2 id="028d" class="lk ll jf bm lm ln lo lp lq lr ls lt lu kn lv lw lx kr ly lz ma kv mb mc md me fw" data-selectable-paragraph="">References:</h2>
<p><a href="https://developers.google.com/identity/sign-in/android/start-integrating">https://developers.google.com/identity/sign-in/android/start-integrating</a></p>
<div class="it iu gp gr iv mv">
<div class="mw o fn">
<div class="nd l">
<div class="nj l nf ng nh nd ni jb mv"><a href="https://github.com/googlesamples/google-services/blob/master/android/signin/app/src/main/java/com/google/samples/quickstart/signin/SignInActivity.java">https://github.com/googlesamples/google-services/blob/master/android/signin/app/src/main/java/com/google/samples/quickstart/signin/SignInActivity.java</a></div>
</div>
</div>
</div>
<p id="d5cb" class="pw-post-body-paragraph kc kd jf ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph=""><a class="au mk" href="https://developer.android.com/jetpack/compose/libraries" target="_blank" rel="noopener ugc nofollow">https://developer.android.com/jetpack/compose/libraries</a></p><p>The post <a href="https://falabellaindia.com/portfolio/2987-2/">Google Sign in with Jetpack Compose</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Karate Framework for API testing</title>
		<link>https://falabellaindia.com/portfolio/karate-framework-for-api-testing/</link>
		
		<dc:creator><![CDATA[admin]]></dc:creator>
		<pubDate>Thu, 20 Oct 2022 22:04:50 +0000</pubDate>
				<guid isPermaLink="false">https://falabellaindia.com/?post_type=portfolio&#038;p=2984</guid>

					<description><![CDATA[<p>Selecting the optimum tool for API automation testing is a tricky task. If you are looking to achieve parallel testing, mock testing, data-driven testing, API testing, UI automation and performance testing through integration with Gatling, all with one open-source tool, then the Karate framework is a highly suitable option. Karate is based [&#8230;]</p>
<p>The post <a href="https://falabellaindia.com/portfolio/karate-framework-for-api-testing/">Karate Framework for API testing</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></description>
										<content:encoded><![CDATA[<p id="6aaa" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Selecting the optimum tool for API automation testing is a tricky task. If you are looking to achieve</p>
<ul class="">
<li id="5ac8" class="kp kq iu jt b ju jv jy jz kc kr kg ks kk kt ko ku kv kw kx fw" data-selectable-paragraph="">parallel testing,</li>
<li id="78d6" class="kp kq iu jt b ju ky jy kz kc la kg lb kk lc ko ku kv kw kx fw" data-selectable-paragraph="">mock testing,</li>
<li id="0f2a" class="kp kq iu jt b ju ky jy kz kc la kg lb kk lc ko ku kv kw kx fw" data-selectable-paragraph="">data-driven testing,</li>
<li id="cbd8" class="kp kq iu jt b ju ky jy kz kc la kg lb kk lc ko ku kv kw kx fw" data-selectable-paragraph="">API testing,</li>
<li id="0299" class="kp kq iu jt b ju ky jy kz kc la kg lb kk lc ko ku kv kw kx fw" data-selectable-paragraph="">UI automation,</li>
<li id="9aa5" class="kp kq iu jt b ju ky jy kz kc la kg lb kk lc ko ku kv kw kx fw" data-selectable-paragraph="">performance testing through integration with Gatling</li>
</ul>
<p id="de03" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">all with a single open-source tool, then the Karate framework is a highly suitable option.</p>
<p id="d770" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="ld">Karate is based on Java. It requires JDK 1.8 and Maven 3.6.x as prerequisites.</em></p>
<p id="f2ba" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="ld">Use Visual Studio Code as the IDE with two plugins:</em></p>
<ul class="">
<li id="1476" class="kp kq iu jt b ju jv jy jz kc kr kg ks kk kt ko ku kv kw kx fw" data-selectable-paragraph="">Open default browser.</li>
<li id="cce4" class="kp kq iu jt b ju ky jy kz kc la kg lb kk lc ko ku kv kw kx fw" data-selectable-paragraph="">Cucumber with full Gherkin support.</li>
</ul>
<p id="6502" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="ld">To get started, all you have to do is copy and paste the Maven archetype command below into your terminal:</em></p>
<pre><span id="6648" data-selectable-paragraph="">mvn archetype:generate \
-DarchetypeGroupId=com.intuit.karate \
-DarchetypeArtifactId=karate-archetype \
-DarchetypeVersion=1.1.0 \
-DgroupId=com.mycompany \
-DartifactId=myproject</span></pre>
<p id="2727" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Open the project in VS Code and voilà, the framework structure is ready!</p>
<p id="0537" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Karate supports both Java and JavaScript, but you don’t need prior knowledge of either language to start writing tests. Test cases are easy to develop because Karate uses Gherkin keywords with BDD syntax, i.e. Given, And, When, Then.</p>
<p id="1c1b" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">This makes the tests easy to understand for any audience, technical or non-technical. Knowledge transfer and onboarding of new team members are also simple.</p>
<p id="6674" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Framework Structure:</strong></p>
<p id="5c84" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">pom.xml: </strong>contains the Maven, JUnit and Karate dependencies by default after the Karate setup.</p>
<p id="f2c8" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">karate-config.js: </strong>this is where you declare constants such as auth, base URL and env, along with reusable code that all feature files need.</p>
<figure class="lf lg lh li gt lj gh gi paragraph-image">
<div class="lk ll do lm ce ln" tabindex="0" role="button">
<div class="gh gi le"><picture><source srcset="https://miro.medium.com/max/640/1*X7MlE9DdQrGlEu1BP5ioXQ.png 640w, https://miro.medium.com/max/720/1*X7MlE9DdQrGlEu1BP5ioXQ.png 720w, https://miro.medium.com/max/750/1*X7MlE9DdQrGlEu1BP5ioXQ.png 750w, https://miro.medium.com/max/786/1*X7MlE9DdQrGlEu1BP5ioXQ.png 786w, https://miro.medium.com/max/828/1*X7MlE9DdQrGlEu1BP5ioXQ.png 828w, https://miro.medium.com/max/1100/1*X7MlE9DdQrGlEu1BP5ioXQ.png 1100w, https://miro.medium.com/max/1400/1*X7MlE9DdQrGlEu1BP5ioXQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img decoding="async" class="ce lo lp c" role="presentation" src="https://miro.medium.com/max/1400/1*X7MlE9DdQrGlEu1BP5ioXQ.png" alt="" width="700" height="854" /></picture></div>
</div>
</figure>
<p id="ce91" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">TestRunner.java: </strong>contains the template for running all the feature files at once. You can also specify tags here to run a selected group of scenarios.</p>
<figure class="lf lg lh li gt lj gh gi paragraph-image">
<div class="lk ll do lm ce ln" tabindex="0" role="button">
<div class="gh gi lq"><picture><source srcset="https://miro.medium.com/max/640/1*5UfGLYDlMDWns54PD1F4gw.png 640w, https://miro.medium.com/max/720/1*5UfGLYDlMDWns54PD1F4gw.png 720w, https://miro.medium.com/max/750/1*5UfGLYDlMDWns54PD1F4gw.png 750w, https://miro.medium.com/max/786/1*5UfGLYDlMDWns54PD1F4gw.png 786w, https://miro.medium.com/max/828/1*5UfGLYDlMDWns54PD1F4gw.png 828w, https://miro.medium.com/max/1100/1*5UfGLYDlMDWns54PD1F4gw.png 1100w, https://miro.medium.com/max/1400/1*5UfGLYDlMDWns54PD1F4gw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img decoding="async" class="ce lo lp c" role="presentation" src="https://miro.medium.com/max/1400/1*5UfGLYDlMDWns54PD1F4gw.png" alt="" width="700" height="402" /></picture></div>
</div>
</figure>
<p id="e19b" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">src/test/java: </strong>this holds the package, which can be further divided into two folders: features and resources.</p>
<p id="a0cd" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="ld">features →</em></strong><em class="ld">user.feature</em></p>
<p id="26d5" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="ld">resources → payloads → </em></strong><em class="ld">addUser.json patchUser.json updateUser.json</em></p>
<p id="3ae2" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The feature file is where you write the scenarios.</p>
<p id="b472" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The file name must have the “.feature” extension, for example user.feature.</p>
<p id="626a" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Here is an example of a feature file with two scenarios.</p>
<figure class="lf lg lh li gt lj gh gi paragraph-image">
<div class="lk ll do lm ce ln" tabindex="0" role="button">
<div class="gh gi lr"><picture><source srcset="https://miro.medium.com/max/640/1*bRVmtmMV-dQPul44qwOa2g.png 640w, https://miro.medium.com/max/720/1*bRVmtmMV-dQPul44qwOa2g.png 720w, https://miro.medium.com/max/750/1*bRVmtmMV-dQPul44qwOa2g.png 750w, https://miro.medium.com/max/786/1*bRVmtmMV-dQPul44qwOa2g.png 786w, https://miro.medium.com/max/828/1*bRVmtmMV-dQPul44qwOa2g.png 828w, https://miro.medium.com/max/1100/1*bRVmtmMV-dQPul44qwOa2g.png 1100w, https://miro.medium.com/max/1400/1*bRVmtmMV-dQPul44qwOa2g.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce lo lp c" role="presentation" src="https://miro.medium.com/max/1400/1*bRVmtmMV-dQPul44qwOa2g.png" alt="" width="700" height="697" /></picture></div>
</div>
</figure>
<figure class="lf lg lh li gt lj gh gi paragraph-image">
<div class="lk ll do lm ce ln" tabindex="0" role="button">
<div class="gh gi ls"><picture><source srcset="https://miro.medium.com/max/640/1*YfTN1AtQxCnHDAo48MqMSg.png 640w, https://miro.medium.com/max/720/1*YfTN1AtQxCnHDAo48MqMSg.png 720w, https://miro.medium.com/max/750/1*YfTN1AtQxCnHDAo48MqMSg.png 750w, https://miro.medium.com/max/786/1*YfTN1AtQxCnHDAo48MqMSg.png 786w, https://miro.medium.com/max/828/1*YfTN1AtQxCnHDAo48MqMSg.png 828w, https://miro.medium.com/max/1100/1*YfTN1AtQxCnHDAo48MqMSg.png 1100w, https://miro.medium.com/max/1400/1*YfTN1AtQxCnHDAo48MqMSg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce lo lp c" role="presentation" src="https://miro.medium.com/max/1400/1*YfTN1AtQxCnHDAo48MqMSg.png" alt="" width="700" height="525" /></picture></div>
</div>
</figure>
<p id="91af" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">You can go ahead and add multiple scenarios and multiple feature files as per your requirement.</p>
<p id="b48e" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Now you are ready to execute the tests.</p>
<p id="b0d0" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Command for test execution:</strong></p>
<p id="0caf" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">1. Use the command below to run all feature files in a particular test environment:</p>
<p id="cded" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">mvn clean test -Dkarate.env=QA</strong></p>
<p id="69bc" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">OR</p>
<p id="8083" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">2. Use the command below to run test cases tagged “smoke” in a particular test environment:</p>
<p id="1c40" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">mvn clean test "-Dkarate.options=--tags @smoke" -Dkarate.env=QA</strong></p>
<p id="5a11" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Report:</strong></p>
<p id="c75c" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The target → karate-reports folder in the framework structure stores the summary report once test execution completes.</p>
<p id="a513" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">File name: karate-summary.html</strong></p>
<p id="edfa" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">You can right-click it and choose the “open in default browser” option. A colorful report opens, listing the features, the scenarios, the pass and fail counts, and the time taken to execute.</p>
<figure class="lf lg lh li gt lj gh gi paragraph-image">
<div class="lk ll do lm ce ln" tabindex="0" role="button">
<div class="gh gi lt"><picture><source srcset="https://miro.medium.com/max/640/1*V5UZAxZO4IjyJpGuhpGj_w.png 640w, https://miro.medium.com/max/720/1*V5UZAxZO4IjyJpGuhpGj_w.png 720w, https://miro.medium.com/max/750/1*V5UZAxZO4IjyJpGuhpGj_w.png 750w, https://miro.medium.com/max/786/1*V5UZAxZO4IjyJpGuhpGj_w.png 786w, https://miro.medium.com/max/828/1*V5UZAxZO4IjyJpGuhpGj_w.png 828w, https://miro.medium.com/max/1100/1*V5UZAxZO4IjyJpGuhpGj_w.png 1100w, https://miro.medium.com/max/1400/1*V5UZAxZO4IjyJpGuhpGj_w.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce lo lp c" role="presentation" src="https://miro.medium.com/max/1400/1*V5UZAxZO4IjyJpGuhpGj_w.png" alt="" width="700" height="109" /></picture></div>
</div>
</figure>
<p id="b06c" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Clicking a feature expands it to show all of its scenarios in detail.</p>
<figure class="lf lg lh li gt lj gh gi paragraph-image">
<div class="lk ll do lm ce ln" tabindex="0" role="button">
<div class="gh gi lt"><picture><source srcset="https://miro.medium.com/max/640/1*IOOiv4MWJebfkzP9Uy0hnw.png 640w, https://miro.medium.com/max/720/1*IOOiv4MWJebfkzP9Uy0hnw.png 720w, https://miro.medium.com/max/750/1*IOOiv4MWJebfkzP9Uy0hnw.png 750w, https://miro.medium.com/max/786/1*IOOiv4MWJebfkzP9Uy0hnw.png 786w, https://miro.medium.com/max/828/1*IOOiv4MWJebfkzP9Uy0hnw.png 828w, https://miro.medium.com/max/1100/1*IOOiv4MWJebfkzP9Uy0hnw.png 1100w, https://miro.medium.com/max/1400/1*IOOiv4MWJebfkzP9Uy0hnw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce lo lp c" role="presentation" src="https://miro.medium.com/max/1400/1*IOOiv4MWJebfkzP9Uy0hnw.png" alt="" width="700" height="180" /></picture></div>
</div>
</figure>
<p id="d204" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">For a failed scenario, the report looks like this:</p>
<figure class="lf lg lh li gt lj gh gi paragraph-image">
<div class="lk ll do lm ce ln" tabindex="0" role="button">
<div class="gh gi lt"><picture><source srcset="https://miro.medium.com/max/640/1*yxFz1Tt9cuXbFDXP1qAHJw.png 640w, https://miro.medium.com/max/720/1*yxFz1Tt9cuXbFDXP1qAHJw.png 720w, https://miro.medium.com/max/750/1*yxFz1Tt9cuXbFDXP1qAHJw.png 750w, https://miro.medium.com/max/786/1*yxFz1Tt9cuXbFDXP1qAHJw.png 786w, https://miro.medium.com/max/828/1*yxFz1Tt9cuXbFDXP1qAHJw.png 828w, https://miro.medium.com/max/1100/1*yxFz1Tt9cuXbFDXP1qAHJw.png 1100w, https://miro.medium.com/max/1400/1*yxFz1Tt9cuXbFDXP1qAHJw.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce lo lp c" role="presentation" src="https://miro.medium.com/max/1400/1*yxFz1Tt9cuXbFDXP1qAHJw.png" alt="" width="700" height="109" /></picture></div>
</div>
</figure>
<figure class="lf lg lh li gt lj gh gi paragraph-image">
<div class="lk ll do lm ce ln" tabindex="0" role="button">
<div class="gh gi lt"><picture><source srcset="https://miro.medium.com/max/640/1*QDbE8-WHTpACQKxF8kmPBg.png 640w, https://miro.medium.com/max/720/1*QDbE8-WHTpACQKxF8kmPBg.png 720w, https://miro.medium.com/max/750/1*QDbE8-WHTpACQKxF8kmPBg.png 750w, https://miro.medium.com/max/786/1*QDbE8-WHTpACQKxF8kmPBg.png 786w, https://miro.medium.com/max/828/1*QDbE8-WHTpACQKxF8kmPBg.png 828w, https://miro.medium.com/max/1100/1*QDbE8-WHTpACQKxF8kmPBg.png 1100w, https://miro.medium.com/max/1400/1*QDbE8-WHTpACQKxF8kmPBg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce lo lp c" role="presentation" src="https://miro.medium.com/max/1400/1*QDbE8-WHTpACQKxF8kmPBg.png" alt="" width="700" height="258" /></picture></div>
</div>
</figure>
<p id="e099" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The built-in report is the best part of the framework: you don’t need any third-party tool, plugin, or extra code for report generation.</p>
<p id="c810" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">There you go, the framework for basic API testing is ready!</p>
<p id="b65b" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">References</strong>:</p>
<p id="bfea" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><a class="au lu" href="https://github.com/karatelabs/karate" target="_blank" rel="noopener ugc nofollow">https://github.com/karatelabs/karate</a></p><p>The post <a href="https://falabellaindia.com/portfolio/karate-framework-for-api-testing/">Karate Framework for API testing</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Build Models on Massively Big Data using Continuous/Online Learning</title>
		<link>https://falabellaindia.com/portfolio/build-models-on-massively-big-data-using-continuous-online-learning/</link>
		
		<dc:creator><![CDATA[admin]]></dc:creator>
		<pubDate>Thu, 20 Oct 2022 22:01:39 +0000</pubDate>
				<guid isPermaLink="false">https://falabellaindia.com/?post_type=portfolio&#038;p=2980</guid>

					<description><![CDATA[<p>The world seems to be in a mad race to accumulate more data by the day. The rate of data growth is being measured in zettabytes. With this mammoth accumulation of data comes the mammoth task of managing it and using it for model building. A lot of times, it gets difficult to preview such data, let [&#8230;]</p>
<p>The post <a href="https://falabellaindia.com/portfolio/build-models-on-massively-big-data-using-continuous-online-learning/">Build Models on Massively Big Data using Continuous/Online Learning</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></description>
										<content:encoded><![CDATA[<p id="8379" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The world seems to be in a mad race to accumulate more data by the day. The rate of data growth is being measured in <a class="au kp" href="https://www.statista.com/statistics/871513/worldwide-data-created/#:~:text=The%20total%20amount%20of%20data,reaching%2059%20zettabytes%20in%202020" target="_blank" rel="noopener ugc nofollow">zettabytes</a>. With this mammoth accumulation of data comes the mammoth task of managing it and using it for model building. A lot of times, it gets difficult to even preview such data, let alone perform any operations on it. Below, I’ve listed a few steps on how not to get overwhelmed by this scale. I will use the data from the Kaggle <a class="au kp" href="https://www.kaggle.com/c/avazu-ctr-prediction/overview" target="_blank" rel="noopener ugc nofollow">Click Through Rate Prediction Competition</a> for most of the examples and illustrations.</p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="kw kx do ky ce kz" tabindex="0" role="button">
<div class="gh gi kq"><picture><source srcset="https://miro.medium.com/max/640/1*icrSzTLz4tIYGxl0VkOgPw.jpeg 640w, https://miro.medium.com/max/720/1*icrSzTLz4tIYGxl0VkOgPw.jpeg 720w, https://miro.medium.com/max/750/1*icrSzTLz4tIYGxl0VkOgPw.jpeg 750w, https://miro.medium.com/max/786/1*icrSzTLz4tIYGxl0VkOgPw.jpeg 786w, https://miro.medium.com/max/828/1*icrSzTLz4tIYGxl0VkOgPw.jpeg 828w, https://miro.medium.com/max/1100/1*icrSzTLz4tIYGxl0VkOgPw.jpeg 1100w, https://miro.medium.com/max/1400/1*icrSzTLz4tIYGxl0VkOgPw.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1400/1*icrSzTLz4tIYGxl0VkOgPw.jpeg" alt="" width="700" height="464" /></picture></div>
</div><figcaption class="lc bl gj gh gi ld le bm b bn bo cn" data-selectable-paragraph="">David and the Goliath by Michelangelo</figcaption></figure>
<h2 id="eca5" class="lf lg iu bm lh li lj lk ll lm ln lo lp kc lq lr ls kg lt lu lv kk lw lx ly lz fw" data-selectable-paragraph="">Preview the data:</h2>
<p id="e702" class="pw-post-body-paragraph jr js iu jt b ju ma jw jx jy mb ka kb kc mc ke kf kg md ki kj kk me km kn ko in fw" data-selectable-paragraph="">Most of us get comfortable only once we have previewed the data, before we start analyzing it and building models. As Andrej Karpathy says:</p>
<blockquote class="mf mg mh">
<p id="dcd2" class="jr js mi jt b ju jv jw jx jy jz ka kb mj kd ke kf mk kh ki kj ml kl km kn ko in fw" data-selectable-paragraph=""><a class="au kp" href="http://karpathy.github.io/2019/04/25/recipe/" target="_blank" rel="noopener ugc nofollow">The first step to training a neural net is to not touch any neural net code at all and instead begin by thoroughly inspecting your data. This step is critical. I like to spend copious amount of time (measured in units of hours) scanning through thousands of examples, understanding their distribution and looking for patterns. Luckily, your brain is pretty good at this.</a></p>
</blockquote>
<p id="5b9e" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">But when you are presented with a few gigabytes of data in a single file, it gets harder to open or preview the data using traditional tools like Notepad++ or vim, which can consume the entire system RAM trying to load the file. One alternative is to use tools like <em class="mi">less </em>in UNIX, which show a preview of the data instead of loading the whole file. In Windows, there are options like EditPad Lite and Large Text File Viewer, although I still prefer using the <em class="mi">more</em> command from PowerShell or, even better, using all the benefits of Unix from <a class="au kp" href="https://docs.microsoft.com/en-us/windows/wsl/install-win10" target="_blank" rel="noopener ugc nofollow">WSL</a>. Below is the data preview produced by running the <em class="mi">less</em> command on train.csv from the CTR prediction dataset:</p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="kw kx do ky ce kz" tabindex="0" role="button">
<div class="gh gi mm"><picture><source srcset="https://miro.medium.com/max/640/1*i-iXpUpcnXvtUQ9t6RZLow.png 640w, https://miro.medium.com/max/720/1*i-iXpUpcnXvtUQ9t6RZLow.png 720w, https://miro.medium.com/max/750/1*i-iXpUpcnXvtUQ9t6RZLow.png 750w, https://miro.medium.com/max/786/1*i-iXpUpcnXvtUQ9t6RZLow.png 786w, https://miro.medium.com/max/828/1*i-iXpUpcnXvtUQ9t6RZLow.png 828w, https://miro.medium.com/max/1100/1*i-iXpUpcnXvtUQ9t6RZLow.png 1100w, https://miro.medium.com/max/1400/1*i-iXpUpcnXvtUQ9t6RZLow.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1400/1*i-iXpUpcnXvtUQ9t6RZLow.png" alt="" width="700" height="384" /></picture></div>
</div>
</figure>
<p id="ca8d" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Once we preview the data visually, we at least know the basic traits of the file: the number of columns, the column names (if present as part of the header), the delimiter, the basic types of the fields (numeric, floating, string, etc.), and the presence of missing values (if you can spot them in the first few records). You could also compute the total number of records with the <em class="mi">wc -l</em> command (wc stands for word count; the -l flag counts lines, so the header line is included in the total).</p>
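<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">If you would rather stay in Python, the same preview and line count can be sketched with the standard library alone. This is only a minimal sketch: the helper names preview and count_lines are my own, and the tiny temporary file stands in for the real train.csv so that the snippet runs anywhere.</p>
<pre>
```python
import csv
import itertools
import os
import tempfile


def preview(path, n_rows=5):
    """Return the header and the first n_rows records without loading the whole file."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(itertools.islice(reader, n_rows))
    return header, rows


def count_lines(path, block_size=1024 * 1024):
    """Count newline characters by reading fixed-size binary blocks (similar to wc -l)."""
    total = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            total += block.count(b"\n")
    return total


# Tiny stand-in file (hypothetical contents); replace sample_path with the real train.csv.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("id,click,hour\n1,0,14102100\n2,1,14102100\n3,0,14102101\n")
    sample_path = tmp.name

header, rows = preview(sample_path, n_rows=2)
n_lines = count_lines(sample_path)
os.remove(sample_path)

print(header)   # column names taken from the header line
print(rows)     # first two records, every field still a string
print(n_lines)  # total lines, header included
```
</pre>
<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Both helpers touch only a small part of the file at a time, so memory use stays flat regardless of file size.</p>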
<h1 id="a56a" class="mn lg iu bm lh mo mp mq ll mr ms mt lp mu mv mw ls mx my mz lv na nb nc ly nd fw" data-selectable-paragraph="">Read Data</h1>
<p id="8038" class="pw-post-body-paragraph jr js iu jt b ju ma jw jx jy mb ka kb kc mc ke kf kg md ki kj kk me km kn ko in fw" data-selectable-paragraph="">The next logical step would be to read this data into your system. Again, given the massive data size, it would be close to impossible to read the dataset into the system RAM.</p>
<h2 id="4eab" class="lf lg iu bm lh li lj lk ll lm ln lo lp kc lq lr ls kg lt lu lv kk lw lx ly lz fw" data-selectable-paragraph="">Read Data line by line:</h2>
<p id="3997" class="pw-post-body-paragraph jr js iu jt b ju ma jw jx jy mb ka kb kc mc ke kf kg md ki kj kk me km kn ko in fw" data-selectable-paragraph="">One option is to read the file line by line, as in the process_data_linebyline method in the code at the end of this section. Note: I have not added any preprocessing, but you may choose to perform any processing after the read step (at the comment). Also note that this code reads every field as a string, so you may want to do the necessary type conversions in the processing step.</p>
<p id="34ef" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">This works well without running into any RAM constraints, and it takes about 70 seconds to read 5.8 GB of data. Because only one line is held in memory at a time, this method works for any data size on any amount of system RAM.</p>
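<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">To make the type-conversion point concrete, here is a minimal sketch of a processing step that streams the records and keeps only a running total. The function name click_through_rate and the in-memory sample are assumptions made for illustration; the column names id, click and hour come from the CTR dataset.</p>
<pre>
```python
import io


def click_through_rate(lines):
    """Stream CSV lines one at a time and return clicks / records."""
    clicks = 0
    records = 0
    click_idx = None
    for n, line in enumerate(lines):
        fields = line.rstrip().split(",")
        if n == 0:
            click_idx = fields.index("click")  # locate the column from the header
            continue
        clicks += int(fields[click_idx])  # every field arrives as a string
        records += 1
    return clicks / records if records else 0.0


# In-memory stand-in for the real file; an open file handle can be passed the same way.
sample = io.StringIO(
    "id,click,hour\n"
    "1,0,14102100\n"
    "2,1,14102100\n"
    "3,0,14102101\n"
    "4,1,14102101\n"
)
ctr = click_through_rate(sample)
print(ctr)
```
</pre>
<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Only two counters live in memory, which is why this style of processing scales to files of any size.</p>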
<h2 id="7bf7" class="lf lg iu bm lh li lj lk ll lm ln lo lp kc lq lr ls kg lt lu lv kk lw lx ly lz fw" data-selectable-paragraph="">Read Mini-Batch Data with Pandas:</h2>
<p id="98e5" class="pw-post-body-paragraph jr js iu jt b ju ma jw jx jy mb ka kb kc mc ke kf kg md ki kj kk me km kn ko in fw" data-selectable-paragraph="">While the above method works perfectly well on any small machine, we can see that it is pretty inefficient, as it processes line after line. This has two problems:</p>
<ul class="">
<li id="581d" class="ne nf iu jt b ju jv jy jz kc ng kg nh kk ni ko nj nk nl nm fw" data-selectable-paragraph="">We are not utilizing the complete RAM of the system and we can definitely do better on time</li>
<li id="4e8c" class="ne nf iu jt b ju nn jy no kc np kg nq kk nr ko nj nk nl nm fw" data-selectable-paragraph="">Some of the data processing may require us to see more than 1 line of data (for example finding out the distribution of a variable)</li>
</ul>
<p id="6832" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Both problems can be addressed by reading the data in mini-batches. One of the easiest ways to do this is to use the chunksize parameter of pandas.read_csv, as shown in the summary code below.</p>
<p id="8e74" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">You could also choose the columns that you wish to read, by specifying usecols option in read_csv, thus reducing the memory consumption even further.</p>
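<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">As a rough sketch of how chunksize and usecols combine, the snippet below builds the distribution of one variable across mini-batches. The in-memory sample, the chunk size of 2 and the choice of banner_pos are assumptions made purely for illustration; on the real file you would pass the path and a much larger chunk size.</p>
<pre>
```python
import io
from collections import Counter

import pandas as pd

# In-memory stand-in for train.csv (hypothetical rows); a file path works the same way.
sample = io.StringIO(
    "id,click,hour,banner_pos\n"
    "1,0,14102100,0\n"
    "2,1,14102100,1\n"
    "3,0,14102101,0\n"
    "4,1,14102101,0\n"
    "5,0,14102102,1\n"
)

totals = Counter()
n_chunks = 0
# usecols drops the columns we do not need; chunksize yields DataFrames of 2 rows each.
for chunk in pd.read_csv(sample, usecols=["click", "banner_pos"], chunksize=2):
    # Merge the per-chunk counts into a running total, so only one chunk
    # is held in memory at a time.
    totals.update(chunk["banner_pos"].value_counts().to_dict())
    n_chunks += 1

print(n_chunks)
print(dict(totals))
```
</pre>
<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Statistics that can be merged across chunks (counts, sums, minima, maxima) fit this pattern well. Statistics such as exact medians need a different approach.</p>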
<h2 id="4f3d" class="lf lg iu bm lh li lj lk ll lm ln lo lp kc lq lr ls kg lt lu lv kk lw lx ly lz fw" data-selectable-paragraph="">Dask</h2>
<p id="9454" class="pw-post-body-paragraph jr js iu jt b ju ma jw jx jy mb ka kb kc mc ke kf kg md ki kj kk me km kn ko in fw" data-selectable-paragraph="">Another alternative is to use Dask, which combines distributed computing with lazy loading. This means that it uses multiple cores to read the data in parallel, and when you run short of memory, it loads only the structure of the data and returns the actual data only when required. In the example below, you can see that read_csv takes hardly a few microseconds.</p>
<p id="a927" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The code below summarizes all three approaches:</p>
<figure class="kr ks kt ku gt kv">
<div class="m fo l do">
<div class="zi nt l">
<table class="highlight tab-size js-file-line-container js-code-nav-container js-tagsearch-file" data-hpc="" data-tab-size="8" data-paste-markdown-skip="" data-tagsearch-lang="" data-tagsearch-path="read_data">
<tbody>
<tr>
<td id="file-read_data-LC1" class="blob-code blob-code-inner js-file-line">import time</td>
</tr>
<tr>
<td id="file-read_data-L2" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="2"></td>
<td id="file-read_data-LC2" class="blob-code blob-code-inner js-file-line">import pandas as pd</td>
</tr>
<tr>
<td id="file-read_data-L3" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="3"></td>
<td id="file-read_data-LC3" class="blob-code blob-code-inner js-file-line">import dask.dataframe as dd</td>
</tr>
<tr>
<td id="file-read_data-L4" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="4"></td>
<td id="file-read_data-LC4" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-read_data-L5" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="5"></td>
<td id="file-read_data-LC5" class="blob-code blob-code-inner js-file-line">class read_data:</td>
</tr>
<tr>
<td id="file-read_data-L6" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="6"></td>
<td id="file-read_data-LC6" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-read_data-L7" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="7"></td>
<td id="file-read_data-LC7" class="blob-code blob-code-inner js-file-line">def __init__(self, file_path:str):</td>
</tr>
<tr>
<td id="file-read_data-L8" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="8"></td>
<td id="file-read_data-LC8" class="blob-code blob-code-inner js-file-line">self.file_path = file_path</td>
</tr>
<tr>
<td id="file-read_data-L9" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="9"></td>
<td id="file-read_data-LC9" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-read_data-L10" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="10"></td>
<td id="file-read_data-LC10" class="blob-code blob-code-inner js-file-line">def process_data_linebyline(self):</td>
</tr>
<tr>
<td id="file-read_data-L11" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="11"></td>
<td id="file-read_data-LC11" class="blob-code blob-code-inner js-file-line">start_time = time.time()</td>
</tr>
<tr>
<td id="file-read_data-L12" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="12"></td>
<td id="file-read_data-LC12" class="blob-code blob-code-inner js-file-line">with open(self.file_path,’r’) as f:</td>
</tr>
<tr>
<td id="file-read_data-L13" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="13"></td>
<td id="file-read_data-LC13" class="blob-code blob-code-inner js-file-line">for n,line in enumerate(f):</td>
</tr>
<tr>
<td id="file-read_data-L14" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="14"></td>
<td id="file-read_data-LC14" class="blob-code blob-code-inner js-file-line">if n > 0: #ignore the header</td>
</tr>
<tr>
<td id="file-read_data-L15" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="15"></td>
<td id="file-read_data-LC15" class="blob-code blob-code-inner js-file-line">data = line.rstrip().split(‘,’)</td>
</tr>
<tr>
<td id="file-read_data-L16" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="16"></td>
<td id="file-read_data-LC16" class="blob-code blob-code-inner js-file-line">#further process data</td>
</tr>
<tr>
<td id="file-read_data-L17" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="17"></td>
<td id="file-read_data-LC17" class="blob-code blob-code-inner js-file-line">print(” — %s seconds –” % (time.time() – start_time))</td>
</tr>
<tr>
<td id="file-read_data-L18" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="18"></td>
<td id="file-read_data-LC18" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-read_data-L19" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="19"></td>
<td id="file-read_data-LC19" class="blob-code blob-code-inner js-file-line">def process_data_dataframechunks(self,chunksize : int):</td>
</tr>
<tr>
<td id="file-read_data-L20" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="20"></td>
<td id="file-read_data-LC20" class="blob-code blob-code-inner js-file-line">start_time = time.time()</td>
</tr>
<tr>
<td id="file-read_data-L21" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="21"></td>
<td id="file-read_data-LC21" class="blob-code blob-code-inner js-file-line">df = pd.read_csv(self.file_path,chunksize=chunksize)</td>
</tr>
<tr>
<td id="file-read_data-L22" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="22"></td>
<td id="file-read_data-LC22" class="blob-code blob-code-inner js-file-line">for chunk in df:</td>
</tr>
<tr>
<td id="file-read_data-L23" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="23"></td>
<td id="file-read_data-LC23" class="blob-code blob-code-inner js-file-line">pass</td>
</tr>
<tr>
<td id="file-read_data-L24" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="24"></td>
<td id="file-read_data-LC24" class="blob-code blob-code-inner js-file-line">#further process chunks</td>
</tr>
<tr>
<td id="file-read_data-L25" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="25"></td>
<td id="file-read_data-LC25" class="blob-code blob-code-inner js-file-line">print("--- %s seconds ---" % (time.time() - start_time))</td>
</tr>
<tr>
<td id="file-read_data-L26" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="26"></td>
<td id="file-read_data-LC26" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-read_data-L27" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="27"></td>
<td id="file-read_data-LC27" class="blob-code blob-code-inner js-file-line">def process_data_dask(self):</td>
</tr>
<tr>
<td id="file-read_data-L28" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="28"></td>
<td id="file-read_data-LC28" class="blob-code blob-code-inner js-file-line">dtype_dict = {</td>
</tr>
<tr>
<td id="file-read_data-L29" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="29"></td>
<td id="file-read_data-LC29" class="blob-code blob-code-inner js-file-line">'id':'uint64',</td>
</tr>
<tr>
<td id="file-read_data-L30" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="30"></td>
<td id="file-read_data-LC30" class="blob-code blob-code-inner js-file-line">'click':'int64',</td>
</tr>
<tr>
<td id="file-read_data-L31" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="31"></td>
<td id="file-read_data-LC31" class="blob-code blob-code-inner js-file-line">'hour':'int64',</td>
</tr>
<tr>
<td id="file-read_data-L32" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="32"></td>
<td id="file-read_data-LC32" class="blob-code blob-code-inner js-file-line">'C1':'int64',</td>
</tr>
<tr>
<td id="file-read_data-L33" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="33"></td>
<td id="file-read_data-LC33" class="blob-code blob-code-inner js-file-line">'banner_pos':'int64',</td>
</tr>
<tr>
<td id="file-read_data-L34" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="34"></td>
<td id="file-read_data-LC34" class="blob-code blob-code-inner js-file-line">'site_id':'object',</td>
</tr>
<tr>
<td id="file-read_data-L35" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="35"></td>
<td id="file-read_data-LC35" class="blob-code blob-code-inner js-file-line">'site_domain':'object',</td>
</tr>
<tr>
<td id="file-read_data-L36" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="36"></td>
<td id="file-read_data-LC36" class="blob-code blob-code-inner js-file-line">'site_category':'object',</td>
</tr>
<tr>
<td id="file-read_data-L37" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="37"></td>
<td id="file-read_data-LC37" class="blob-code blob-code-inner js-file-line">'app_id':'object',</td>
</tr>
<tr>
<td id="file-read_data-L38" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="38"></td>
<td id="file-read_data-LC38" class="blob-code blob-code-inner js-file-line">'app_domain':'object',</td>
</tr>
<tr>
<td id="file-read_data-L39" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="39"></td>
<td id="file-read_data-LC39" class="blob-code blob-code-inner js-file-line">'app_category':'object',</td>
</tr>
<tr>
<td id="file-read_data-L40" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="40"></td>
<td id="file-read_data-LC40" class="blob-code blob-code-inner js-file-line">'device_id':'object',</td>
</tr>
<tr>
<td id="file-read_data-L41" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="41"></td>
<td id="file-read_data-LC41" class="blob-code blob-code-inner js-file-line">'device_ip':'object',</td>
</tr>
<tr>
<td id="file-read_data-L42" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="42"></td>
<td id="file-read_data-LC42" class="blob-code blob-code-inner js-file-line">'device_model':'object',</td>
</tr>
<tr>
<td id="file-read_data-L43" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="43"></td>
<td id="file-read_data-LC43" class="blob-code blob-code-inner js-file-line">'device_type':'int64',</td>
</tr>
<tr>
<td id="file-read_data-L44" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="44"></td>
<td id="file-read_data-LC44" class="blob-code blob-code-inner js-file-line">'device_conn_type':'int64',</td>
</tr>
<tr>
<td id="file-read_data-L45" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="45"></td>
<td id="file-read_data-LC45" class="blob-code blob-code-inner js-file-line">'C14':'int64',</td>
</tr>
<tr>
<td id="file-read_data-L46" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="46"></td>
<td id="file-read_data-LC46" class="blob-code blob-code-inner js-file-line">'C15':'int64',</td>
</tr>
<tr>
<td id="file-read_data-L47" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="47"></td>
<td id="file-read_data-LC47" class="blob-code blob-code-inner js-file-line">'C16':'int64',</td>
</tr>
<tr>
<td id="file-read_data-L48" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="48"></td>
<td id="file-read_data-LC48" class="blob-code blob-code-inner js-file-line">'C17':'int64',</td>
</tr>
<tr>
<td id="file-read_data-L49" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="49"></td>
<td id="file-read_data-LC49" class="blob-code blob-code-inner js-file-line">'C18':'int64',</td>
</tr>
<tr>
<td id="file-read_data-L50" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="50"></td>
<td id="file-read_data-LC50" class="blob-code blob-code-inner js-file-line">'C19':'int64',</td>
</tr>
<tr>
<td id="file-read_data-L51" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="51"></td>
<td id="file-read_data-LC51" class="blob-code blob-code-inner js-file-line">'C20':'int64',</td>
</tr>
<tr>
<td id="file-read_data-L52" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="52"></td>
<td id="file-read_data-LC52" class="blob-code blob-code-inner js-file-line">'C21':'int64'</td>
</tr>
<tr>
<td id="file-read_data-L53" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="53"></td>
<td id="file-read_data-LC53" class="blob-code blob-code-inner js-file-line">}</td>
</tr>
<tr>
<td id="file-read_data-L54" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="54"></td>
<td id="file-read_data-LC54" class="blob-code blob-code-inner js-file-line">start_time = time.time()</td>
</tr>
<tr>
<td id="file-read_data-L55" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="55"></td>
<td id="file-read_data-LC55" class="blob-code blob-code-inner js-file-line">df = dd.read_csv(self.file_path, dtype=dtype_dict)</td>
</tr>
<tr>
<td id="file-read_data-L56" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="56"></td>
<td id="file-read_data-LC56" class="blob-code blob-code-inner js-file-line">print("--- %s seconds ---" % (time.time() - start_time))</td>
</tr>
<tr>
<td id="file-read_data-L57" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="57"></td>
<td id="file-read_data-LC57" class="blob-code blob-code-inner js-file-line">print(df.head())</td>
</tr>
<tr>
<td id="file-read_data-L58" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="58"></td>
<td id="file-read_data-LC58" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-read_data-L59" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="59"></td>
<td id="file-read_data-LC59" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-read_data-L60" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="60"></td>
<td id="file-read_data-LC60" class="blob-code blob-code-inner js-file-line">if __name__ == "__main__":</td>
</tr>
<tr>
<td id="file-read_data-L61" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="61"></td>
<td id="file-read_data-LC61" class="blob-code blob-code-inner js-file-line">readObj = read_data('../data/train.csv')</td>
</tr>
<tr>
<td id="file-read_data-L62" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="62"></td>
<td id="file-read_data-LC62" class="blob-code blob-code-inner js-file-line">#readObj.process_data_linebyline()</td>
</tr>
<tr>
<td id="file-read_data-L63" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="63"></td>
<td id="file-read_data-LC63" class="blob-code blob-code-inner js-file-line">#readObj.process_data_dataframechunks(chunksize=500000)</td>
</tr>
<tr>
<td id="file-read_data-L64" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="64"></td>
<td id="file-read_data-LC64" class="blob-code blob-code-inner js-file-line">readObj.process_data_dask()</td>
</tr>
</tbody>
</table>
</div>
</div>
</figure>
<h2 id="2341" class="lf lg iu bm lh li lj lk ll lm ln lo lp kc lq lr ls kg lt lu lv kk lw lx ly lz fw" data-selectable-paragraph="">Numpy memmap</h2>
<p id="6e05" class="pw-post-body-paragraph jr js iu jt b ju ma jw jx jy mb ka kb kc mc ke kf kg md ki kj kk me km kn ko in fw" data-selectable-paragraph="">Speaking of lazy loading: if your data is purely numeric, you can alternatively use NumPy memmap, which maps an array onto a file on disk instead of loading it. The data is fetched only when the corresponding indices are actually accessed. (<a class="au kp" href="https://stackoverflow.com/questions/43393821/how-to-concat-many-numpy-arrays" target="_blank" rel="noopener ugc nofollow">https://stackoverflow.com/questions/43393821/how-to-concat-many-numpy-arrays</a>)</p>
<pre class="kr ks kt ku gt nu bs nv nw dz nx"><span id="602f" class="fw lf lg iu nx b dm ny nz l oa" data-selectable-paragraph="">ftrainY = np.memmap('data/trainY.npy',dtype=np.int64,shape=(15727857,),mode='w+')
ftestY = np.memmap('data/testY.npy',dtype=np.int64,shape=(3931965,),mode='w+')</span><span id="6e4f" class="fw lf lg iu nx b dm ob nz l oa" data-selectable-paragraph="">fread = np.memmap('data/trainY.npy',dtype=np.int64,mode='r',shape=(15727857,))</span><span id="ec3a" class="fw lf lg iu nx b dm ob nz l oa" data-selectable-paragraph="">def chunker(seq, size):
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))</span><span id="7d47" class="fw lf lg iu nx b dm ob nz l oa" data-selectable-paragraph="">for batch in chunker(fread,500000):
    print(batch.mean())</span></pre>
<h2 id="e904" class="lf lg iu bm lh li lj lk ll lm ln lo lp kc lq lr ls kg lt lu lv kk lw lx ly lz fw" data-selectable-paragraph="">Database-based alternatives</h2>
<p id="59f7" class="pw-post-body-paragraph jr js iu jt b ju ma jw jx jy mb ka kb kc mc ke kf kg md ki kj kk me km kn ko in fw" data-selectable-paragraph="">Another way to read such massive data is to create a small local database, such as SQLite, on your own system and query it with filters using a tool like SQLAlchemy. I’ll probably reserve that discussion for another post.</p>
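<p>As a rough illustration of the idea, here is a minimal sketch that uses Python's built-in sqlite3 module rather than SQLAlchemy. The table name <code>clicks</code>, the four columns and the tiny inline CSV are assumptions made for the example; a real run would stream the full training file in larger batches.</p>

```python
import csv
import io
import sqlite3

# Hypothetical miniature of the click data; a real run would stream train.csv instead.
SAMPLE_CSV = """id,click,hour,banner_pos
1,0,14102100,0
2,1,14102100,1
3,0,14102101,0
4,1,14102101,1
"""


def load_csv_into_sqlite(conn, csv_file, batch_size=2):
    """Insert CSV rows into a SQLite table in small batches."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS clicks "
        "(id INTEGER, click INTEGER, hour INTEGER, banner_pos INTEGER)"
    )
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    batch = []
    for row in reader:
        batch.append(row)
        if len(batch) == batch_size:
            conn.executemany("INSERT INTO clicks VALUES (?, ?, ?, ?)", batch)
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        conn.executemany("INSERT INTO clicks VALUES (?, ?, ?, ?)", batch)
    conn.commit()


def click_rate_for_banner(conn, banner_pos):
    """Let the database filter and aggregate, so only one number reaches Python."""
    cursor = conn.execute(
        "SELECT AVG(click) FROM clicks WHERE banner_pos = ?", (banner_pos,)
    )
    return cursor.fetchone()[0]


conn = sqlite3.connect(":memory:")  # use a file path for data that must persist
load_csv_into_sqlite(conn, io.StringIO(SAMPLE_CSV))
print(click_rate_for_banner(conn, 1))
```

<p>The point of the design is that filtering and aggregation happen inside the database, so Python only ever holds one batch of rows or one query result at a time.</p>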
<h1 id="4185" class="mn lg iu bm lh mo mp mq ll mr ms mt lp mu mv mw ls mx my mz lv na nb nc ly nd fw" data-selectable-paragraph="">Building Models</h1>
<p id="8212" class="pw-post-body-paragraph jr js iu jt b ju ma jw jx jy mb ka kb kc mc ke kf kg md ki kj kk me km kn ko in fw" data-selectable-paragraph="">Once the data is loaded, the next task is to perform some variable reduction or modelling. Most of the readily available machine learning packages expect the entire data to be loaded into RAM before their ‘fit’ functions can run. Below I’ll illustrate some ways to ensure that the ‘fit’ happens irrespective of the data size.</p>
<h2 id="4ed0" class="lf lg iu bm lh li lj lk ll lm ln lo lp kc lq lr ls kg lt lu lv kk lw lx ly lz fw" data-selectable-paragraph="">Smarter initialisers lead to faster convergence</h2>
<p id="c7c0" class="pw-post-body-paragraph jr js iu jt b ju ma jw jx jy mb ka kb kc mc ke kf kg md ki kj kk me km kn ko in fw" data-selectable-paragraph="">A lot of algorithms are sensitive to their weight initialisers. A purely random initialisation takes longer to converge, because the weights have to be adjusted much further to reach the optimum. Poor initialisation may also lead to exploding or vanishing gradients. There are simple tricks for initialising the weights sensibly before learning begins. For example, if we are performing a regression whose target has a mean of, say, 250, then initialise the bias to 250. Also scale your input variables before fitting, so as to reduce the range of the gradients while learning. For classification problems, initialise the logit bias so that the model's initial predicted probability equals the proportion of positive (1) labels in the data. For clustering, determine the distribution of points along the various features (e.g. the 5th, 10th, 25th, 50th, 75th, 90th and 95th percentiles of each feature) and initialise the centroids around these ranges.</p>
<p id="d411" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Another alternative is to fit a model on a very small sample and use its weights (for supervised models) or centroids (for unsupervised models) as initialisers when modelling the large dataset.</p>
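<p>The two bias initialisations mentioned above can be sketched in a few lines of plain Python. The function names are hypothetical, and the logit version assumes that both classes are present in the labels, since otherwise the log-odds are undefined.</p>

```python
import math


def initial_regression_bias(targets):
    """Start the bias at the mean of the target, so initial predictions are centred."""
    return sum(targets) / len(targets)


def initial_logit_bias(labels):
    """Bias b such that sigmoid(b) equals the share of positive labels."""
    positive_rate = sum(labels) / len(labels)
    return math.log(positive_rate / (1.0 - positive_rate))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


targets = [240.0, 250.0, 260.0]
labels = [1, 0, 0, 0]  # 25% positives

print(initial_regression_bias(targets))
print(sigmoid(initial_logit_bias(labels)))
```

<p>With all other weights at zero, a model started this way predicts the target mean (or the class base rate) from the first step, so the early updates are spent on learning the signal rather than on shifting the output level.</p>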
<h2 id="b52b" class="lf lg iu bm lh li lj lk ll lm ln lo lp kc lq lr ls kg lt lu lv kk lw lx ly lz fw" data-selectable-paragraph="">Partial/Incremental fit vs full fit</h2>
<p id="80bb" class="pw-post-body-paragraph jr js iu jt b ju ma jw jx jy mb ka kb kc mc ke kf kg md ki kj kk me km kn ko in fw" data-selectable-paragraph="">A lot of algorithms allow stochastic learning. That is, instead of learning from the entire data at once, they can learn from data supplied row by row. This extends naturally to mini-batches (a small set of rows instead of a single row). In a typical supervised learning setup, we start with some randomly assigned, learnable weights and keep adjusting them as we encounter more data and labels. Below, I discuss some of the methods that use this partial (incremental) fitting.</p>
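<p>Before turning to those methods, the weight-adjustment loop itself can be illustrated with a minimal sketch: a one-variable least-squares regression trained by mini-batch gradient descent. The function names, learning rate and toy data are assumptions chosen for the example, and the in-memory list stands in for chunks that would normally be read from disk.</p>

```python
def partial_fit(weight, bias, batch, learning_rate=0.05):
    """One gradient step of least-squares regression on a single mini-batch."""
    grad_w = 0.0
    grad_b = 0.0
    for x, y in batch:
        error = (weight * x + bias) - y
        grad_w += error * x
        grad_b += error
    n = len(batch)
    weight -= learning_rate * grad_w / n
    bias -= learning_rate * grad_b / n
    return weight, bias


def mini_batches(rows, size):
    """Yield consecutive slices of rows; stands in for reading chunks from disk."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# Toy data following y = 2x + 1, with x between 0 and 1
rows = [(i / 100.0, 2.0 * (i / 100.0) + 1.0) for i in range(100)]

weight, bias = 0.0, 0.0
for epoch in range(500):
    for batch in mini_batches(rows, 10):
        weight, bias = partial_fit(weight, bias, batch)

print(round(weight, 2), round(bias, 2))
```

<p>Only one batch is needed in memory at any moment; the model state carried between batches is just the two numbers <code>weight</code> and <code>bias</code>.</p>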
<p id="071a" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Data Scaling:</strong></p>
<p id="bd97" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The variables can be in different units of measurement (percentages, large floating-point numbers, small-ranged integers, binary flags, etc.). To bring all these variables to a comparable scale, we perform scaling. Two popular scikit-learn APIs for scaling are StandardScaler and MinMaxScaler. StandardScaler computes the mean and standard deviation of each variable, subtracts the mean from every value of the variable and divides the result by the standard deviation, so that the scaled variable has a mean of 0 and a standard deviation of 1:</p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi oc"><picture><source srcset="https://miro.medium.com/max/640/1*kHnihwIOiqTFRtJ2uzfJPg.jpeg 640w, https://miro.medium.com/max/720/1*kHnihwIOiqTFRtJ2uzfJPg.jpeg 720w, https://miro.medium.com/max/750/1*kHnihwIOiqTFRtJ2uzfJPg.jpeg 750w, https://miro.medium.com/max/786/1*kHnihwIOiqTFRtJ2uzfJPg.jpeg 786w, https://miro.medium.com/max/828/1*kHnihwIOiqTFRtJ2uzfJPg.jpeg 828w, https://miro.medium.com/max/1100/1*kHnihwIOiqTFRtJ2uzfJPg.jpeg 1100w, https://miro.medium.com/max/302/1*kHnihwIOiqTFRtJ2uzfJPg.jpeg 302w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 151px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/302/1*kHnihwIOiqTFRtJ2uzfJPg.jpeg" alt="" width="151" height="42" /></picture></div>
</figure>
<p id="d10e" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">where μ is the mean of the variable and σ is the standard deviation of the variable.</p>
<p id="62d5" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Similarly, min-max scaling takes the ratio of the difference between each value and the minimum of the variable to the difference between the maximum and the minimum of the variable, which maps the values into the range 0 to 1:</p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi od"><picture><source srcset="https://miro.medium.com/max/640/0*luNnVkStGPmhBYtQ.jpg 640w, https://miro.medium.com/max/720/0*luNnVkStGPmhBYtQ.jpg 720w, https://miro.medium.com/max/750/0*luNnVkStGPmhBYtQ.jpg 750w, https://miro.medium.com/max/786/0*luNnVkStGPmhBYtQ.jpg 786w, https://miro.medium.com/max/828/0*luNnVkStGPmhBYtQ.jpg 828w, https://miro.medium.com/max/1100/0*luNnVkStGPmhBYtQ.jpg 1100w, https://miro.medium.com/max/512/0*luNnVkStGPmhBYtQ.jpg 512w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 256px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/512/0*luNnVkStGPmhBYtQ.jpg" alt="" width="256" height="46" /></picture></div>
</figure>
<p id="7646" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">However, both the StandardScaler and the MinMaxScaler equations above require the entire distribution to be available at once (in order to compute the mean, standard deviation, max and min), and we are still stuck with the big data problem of not being able to load the entire data into memory. To circumvent this, we employ incremental aggregation methods, which compute the mean, standard deviation, min and max chunk by chunk. Once every chunk has been seen, these running statistics match those of the full data, apart from small floating-point rounding differences.</p>
<p id="9fa7" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Incremental Mean:</strong></p>
<ul class="">
<li id="b265" class="ne nf iu jt b ju jv jy jz kc ng kg nh kk ni ko nj nk nl nm fw" data-selectable-paragraph="">We know that a simple mean of a variable is expressed as the sum of the variable values divided by the number of observations</li>
</ul>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi oe"><picture><source srcset="https://miro.medium.com/max/640/1*3AKwSkKiUCl0SgBmmZ5svg.png 640w, https://miro.medium.com/max/720/1*3AKwSkKiUCl0SgBmmZ5svg.png 720w, https://miro.medium.com/max/750/1*3AKwSkKiUCl0SgBmmZ5svg.png 750w, https://miro.medium.com/max/786/1*3AKwSkKiUCl0SgBmmZ5svg.png 786w, https://miro.medium.com/max/828/1*3AKwSkKiUCl0SgBmmZ5svg.png 828w, https://miro.medium.com/max/1100/1*3AKwSkKiUCl0SgBmmZ5svg.png 1100w, https://miro.medium.com/max/252/1*3AKwSkKiUCl0SgBmmZ5svg.png 252w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 126px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/252/1*3AKwSkKiUCl0SgBmmZ5svg.png" alt="" width="126" height="68" /></picture></div>
</figure>
<ul class="">
<li id="bf77" class="ne nf iu jt b ju jv jy jz kc ng kg nh kk ni ko nj nk nl nm fw" data-selectable-paragraph="">By reordering some of the terms as given below, we can express the mean of the variable up to the n-th observation as a function of the mean up to the (n-1)-th observation</li>
</ul>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi of"><picture><source srcset="https://miro.medium.com/max/640/1*bPmWAu0wnB2IGUzlOHoqVg.png 640w, https://miro.medium.com/max/720/1*bPmWAu0wnB2IGUzlOHoqVg.png 720w, https://miro.medium.com/max/750/1*bPmWAu0wnB2IGUzlOHoqVg.png 750w, https://miro.medium.com/max/786/1*bPmWAu0wnB2IGUzlOHoqVg.png 786w, https://miro.medium.com/max/828/1*bPmWAu0wnB2IGUzlOHoqVg.png 828w, https://miro.medium.com/max/1100/1*bPmWAu0wnB2IGUzlOHoqVg.png 1100w, https://miro.medium.com/max/778/1*bPmWAu0wnB2IGUzlOHoqVg.png 778w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 389px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/778/1*bPmWAu0wnB2IGUzlOHoqVg.png" alt="" width="389" height="351" /></picture></div>
</figure>
<p id="5787" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Thus, the mean up to the n-th observation can be derived from the mean up to the (n-1)-th observation and the new value. In other words, the mean of a variable can be calculated <em class="mi">incrementally</em>. Similarly, we can show that the standard deviation can also be calculated <em class="mi">incrementally</em>. The same holds for the min and max of a variable (keep a running min/max and reassign it whenever a new observation falls below the current min or above the current max).</p>
<p id="e7c7" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Using this incremental aggregation concept, scikit-learn provides a partial_fit function for both scalers, which keeps updating the scaling statistics every time you pass a chunk of data to it. So the learning can be done chunk-wise. Once the entire data has been fit, you can use the scaler object for further transformations. The code below shows how it can be done:</p>
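<p>To make the running statistics concrete, here is a minimal sketch in plain Python. It uses Welford's update for the variance, which is one common way to do this and is not necessarily what scikit-learn does internally; the class name <code>RunningStats</code> is hypothetical.</p>

```python
import math


class RunningStats:
    """Track count, mean, variance, min and max one value at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.minimum = math.inf
        self.maximum = -math.inf

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        # New mean = previous mean + (new value - previous mean) / n
        self.mean += delta / self.count
        # Welford's update: uses the deviation from the old and the new mean
        self.m2 += delta * (value - self.mean)
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def std(self):
        """Population standard deviation (divides by n)."""
        return math.sqrt(self.m2 / self.count) if self.count else 0.0


stats = RunningStats()
# Three chunks arriving one after another
for chunk in ([2.0, 4.0, 4.0], [4.0, 5.0], [5.0, 7.0, 9.0]):
    for value in chunk:
        stats.update(value)

print(stats.mean, stats.std, stats.minimum, stats.maximum)
```

<p>The object holds five numbers no matter how many rows pass through it, which is what makes this approach usable on data that does not fit in memory.</p>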
<figure class="kr ks kt ku gt kv">
<div class="m fo l do">
<div class="zk nt l">
<table class="highlight tab-size js-file-line-container js-code-nav-container js-tagsearch-file" data-hpc="" data-tab-size="8" data-paste-markdown-skip="" data-tagsearch-lang="" data-tagsearch-path="preprocess">
<tbody>
<tr>
<td id="file-preprocess-LC1" class="blob-code blob-code-inner js-file-line">import pandas as pd</td>
</tr>
<tr>
<td id="file-preprocess-L2" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="2"></td>
<td id="file-preprocess-LC2" class="blob-code blob-code-inner js-file-line">from sklearn.preprocessing import StandardScaler, MinMaxScaler</td>
</tr>
<tr>
<td id="file-preprocess-L3" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="3"></td>
<td id="file-preprocess-LC3" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-preprocess-L4" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="4"></td>
<td id="file-preprocess-LC4" class="blob-code blob-code-inner js-file-line">chunksize = 100000</td>
</tr>
<tr>
<td id="file-preprocess-L5" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="5"></td>
<td id="file-preprocess-LC5" class="blob-code blob-code-inner js-file-line">stdscaler = StandardScaler()</td>
</tr>
<tr>
<td id="file-preprocess-L6" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="6"></td>
<td id="file-preprocess-LC6" class="blob-code blob-code-inner js-file-line">#Consider only numeric drivers for our example</td>
</tr>
<tr>
<td id="file-preprocess-L7" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="7"></td>
<td id="file-preprocess-LC7" class="blob-code blob-code-inner js-file-line">numeric_drivers = ['hour','banner_pos','C1','device_type','device_conn_type','C14','C15','C16','C17','C18','C19','C20','C21']</td>
</tr>
<tr>
<td id="file-preprocess-L8" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="8"></td>
<td id="file-preprocess-LC8" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-preprocess-L9" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="9"></td>
<td id="file-preprocess-LC9" class="blob-code blob-code-inner js-file-line">df = pd.read_csv('../data/train.csv',chunksize=chunksize)</td>
</tr>
<tr>
<td id="file-preprocess-L10" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="10"></td>
<td id="file-preprocess-LC10" class="blob-code blob-code-inner js-file-line">for chunk in df:</td>
</tr>
<tr>
<td id="file-preprocess-L11" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="11"></td>
<td id="file-preprocess-LC11" class="blob-code blob-code-inner js-file-line">stdscaler.partial_fit(chunk[numeric_drivers])</td>
</tr>
<tr>
<td id="file-preprocess-L12" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="12"></td>
<td id="file-preprocess-LC12" class="blob-code blob-code-inner js-file-line">#continue preprocessing</td>
</tr>
</tbody>
</table>
</div>
</div>
</figure>
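<p>As a quick sanity check, the following sketch compares a chunk-wise <code>partial_fit</code> with a single full <code>fit</code>. It assumes NumPy and scikit-learn are installed and uses synthetic data in place of the CSV, so the array shape and chunk size are arbitrary choices for the example.</p>

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in for the numeric columns of the CSV
data = rng.normal(loc=50.0, scale=10.0, size=(10_000, 3))

# Fit chunk by chunk, as one would while streaming a large file
chunked_scaler = StandardScaler()
for start in range(0, len(data), 1_000):
    chunked_scaler.partial_fit(data[start:start + 1_000])

# Reference: fit on everything at once (only possible because this toy data fits in RAM)
full_scaler = StandardScaler().fit(data)

# Second pass: the fitted scaler can transform the data chunk by chunk as well
first_chunk_scaled = chunked_scaler.transform(data[:1_000])

print(np.allclose(chunked_scaler.mean_, full_scaler.mean_))
print(np.allclose(chunked_scaler.scale_, full_scaler.scale_))
```

<p>Note that scaling a large file this way takes two passes: one to accumulate the statistics with <code>partial_fit</code>, and a second to apply <code>transform</code> to each chunk.</p>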
<p id="cc5c" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Dimension Reduction:</strong></p>
<ul class="">
<li id="2092" class="ne nf iu jt b ju jv jy jz kc ng kg nh kk ni ko nj nk nl nm fw" data-selectable-paragraph=""><strong class="jt iv">Principal Component Analysis</strong> is typically used to reduce a large number of dimensions to a smaller set of components, where each component is expressed as a linear function of all the underlying features and the components are orthogonal to each other. I will not delve into the workings of PCA as such, but I would point out that the sklearn library provides an IncrementalPCA API. It takes in the data batch by batch and fits the components incrementally, which is very useful when training on large data. Below is an example usage of IncrementalPCA on the same large data example. Note that we also scale the data before we fit the PCA.</li>
</ul>
<figure class="kr ks kt ku gt kv">
<div class="m fo l do">
<div class="zj nt l">
<table class="highlight tab-size js-file-line-container js-code-nav-container js-tagsearch-file" data-hpc="" data-tab-size="8" data-paste-markdown-skip="" data-tagsearch-lang="" data-tagsearch-path="incrementalpca">
<tbody>
<tr>
<td id="file-incrementalpca-L1" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="1"></td>
<td id="file-incrementalpca-LC1" class="blob-code blob-code-inner js-file-line">import pandas as pd</td>
</tr>
<tr>
<td id="file-incrementalpca-L2" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="2"></td>
<td id="file-incrementalpca-LC2" class="blob-code blob-code-inner js-file-line">import numpy as np</td>
</tr>
<tr>
<td id="file-incrementalpca-L3" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="3"></td>
<td id="file-incrementalpca-LC3" class="blob-code blob-code-inner js-file-line">from sklearn.preprocessing import StandardScaler, MinMaxScaler</td>
</tr>
<tr>
<td id="file-incrementalpca-L4" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="4"></td>
<td id="file-incrementalpca-LC4" class="blob-code blob-code-inner js-file-line">from sklearn.decomposition import IncrementalPCA, PCA</td>
</tr>
<tr>
<td id="file-incrementalpca-L5" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="5"></td>
<td id="file-incrementalpca-LC5" class="blob-code blob-code-inner js-file-line">from matplotlib import pyplot as plt</td>
</tr>
<tr>
<td id="file-incrementalpca-L6" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="6"></td>
<td id="file-incrementalpca-LC6" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-incrementalpca-L7" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="7"></td>
<td id="file-incrementalpca-LC7" class="blob-code blob-code-inner js-file-line">chunksize = 10000</td>
</tr>
<tr>
<td id="file-incrementalpca-L8" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="8"></td>
<td id="file-incrementalpca-LC8" class="blob-code blob-code-inner js-file-line">numeric_drivers = ['hour','banner_pos','C1','device_type','device_conn_type','C14','C15','C16','C17','C18','C19','C20','C21']</td>
</tr>
<tr>
<td id="file-incrementalpca-L9" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="9"></td>
<td id="file-incrementalpca-LC9" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-incrementalpca-L10" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="10"></td>
<td id="file-incrementalpca-LC10" class="blob-code blob-code-inner js-file-line">df = pd.read_csv('../data/train.csv',usecols=numeric_drivers,nrows=1000000)</td>
</tr>
<tr>
<td id="file-incrementalpca-L11" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="11"></td>
<td id="file-incrementalpca-LC11" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-incrementalpca-L12" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="12"></td>
<td id="file-incrementalpca-LC12" class="blob-code blob-code-inner js-file-line">#Perform scaling</td>
</tr>
<tr>
<td id="file-incrementalpca-L13" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="13"></td>
<td id="file-incrementalpca-LC13" class="blob-code blob-code-inner js-file-line">stdscaler = StandardScaler()</td>
</tr>
<tr>
<td id="file-incrementalpca-L14" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="14"></td>
<td id="file-incrementalpca-LC14" class="blob-code blob-code-inner js-file-line">df = stdscaler.fit_transform(df)</td>
</tr>
<tr>
<td id="file-incrementalpca-L15" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="15"></td>
<td id="file-incrementalpca-LC15" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-incrementalpca-L16" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="16"></td>
<td id="file-incrementalpca-LC16" class="blob-code blob-code-inner js-file-line">#fit the Incremental PCA</td>
</tr>
<tr>
<td id="file-incrementalpca-L17" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="17"></td>
<td id="file-incrementalpca-LC17" class="blob-code blob-code-inner js-file-line">pcainc = IncrementalPCA(n_components=len(numeric_drivers),batch_size=10000,copy=False,whiten=True)</td>
</tr>
<tr>
<td id="file-incrementalpca-L18" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="18"></td>
<td id="file-incrementalpca-LC18" class="blob-code blob-code-inner js-file-line">pcainc.fit(df)</td>
</tr>
<tr>
<td id="file-incrementalpca-L19" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="19"></td>
<td id="file-incrementalpca-LC19" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-incrementalpca-L20" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="20"></td>
<td id="file-incrementalpca-LC20" class="blob-code blob-code-inner js-file-line">#count the leading components whose cumulative explained variance stays within 85%</td>
</tr>
<tr>
<td id="file-incrementalpca-L21" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="21"></td>
<td id="file-incrementalpca-LC21" class="blob-code blob-code-inner js-file-line">print(np.where(pcainc.explained_variance_ratio_.cumsum()&lt;=0.85)[0].shape)</td>
</tr>
<tr>
<td id="file-incrementalpca-L22" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="22"></td>
<td id="file-incrementalpca-LC22" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-incrementalpca-L23" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="23"></td>
<td id="file-incrementalpca-LC23" class="blob-code blob-code-inner js-file-line">#plot the explained variance chart</td>
</tr>
<tr>
<td id="file-incrementalpca-L24" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="24"></td>
<td id="file-incrementalpca-LC24" class="blob-code blob-code-inner js-file-line">plt.plot(range(0,len(numeric_drivers)),pcainc.explained_variance_ratio_,'bx-')</td>
</tr>
<tr>
<td id="file-incrementalpca-L25" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="25"></td>
<td id="file-incrementalpca-LC25" class="blob-code blob-code-inner js-file-line">plt.xlabel('# Principal Components')</td>
</tr>
<tr>
<td id="file-incrementalpca-L26" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="26"></td>
<td id="file-incrementalpca-LC26" class="blob-code blob-code-inner js-file-line">plt.ylabel('Explained Variance')</td>
</tr>
<tr>
<td id="file-incrementalpca-L27" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="27"></td>
<td id="file-incrementalpca-LC27" class="blob-code blob-code-inner js-file-line">plt.title('Determine the # of PrinComps using explained variance')</td>
</tr>
<tr>
<td id="file-incrementalpca-L28" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="28"></td>
<td id="file-incrementalpca-LC28" class="blob-code blob-code-inner js-file-line">plt.show()</td>
</tr>
<tr>
<td id="file-incrementalpca-L29" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="29"></td>
<td id="file-incrementalpca-LC29" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-incrementalpca-L30" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="30"></td>
<td id="file-incrementalpca-LC30" class="blob-code blob-code-inner js-file-line">#plot the cumulative explained variance chart</td>
</tr>
<tr>
<td id="file-incrementalpca-L31" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="31"></td>
<td id="file-incrementalpca-LC31" class="blob-code blob-code-inner js-file-line">plt.plot(range(0,len(numeric_drivers)),pcainc.explained_variance_ratio_.cumsum(),'bx-')</td>
</tr>
<tr>
<td id="file-incrementalpca-L32" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="32"></td>
<td id="file-incrementalpca-LC32" class="blob-code blob-code-inner js-file-line">plt.xlabel('# Principal Components')</td>
</tr>
<tr>
<td id="file-incrementalpca-L33" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="33"></td>
<td id="file-incrementalpca-LC33" class="blob-code blob-code-inner js-file-line">plt.ylabel('Total Explained Variance')</td>
</tr>
<tr>
<td id="file-incrementalpca-L34" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="34"></td>
<td id="file-incrementalpca-LC34" class="blob-code blob-code-inner js-file-line">plt.title('Determine the # of PrinComps using explained variance')</td>
</tr>
<tr>
<td id="file-incrementalpca-L35" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="35"></td>
<td id="file-incrementalpca-LC35" class="blob-code blob-code-inner js-file-line">plt.show()</td>
</tr>
</tbody>
</table>
</div>
</div>
</figure>
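<p>The gist above loads a million rows into memory before calling fit. As a minimal sketch of a fully out-of-core variant, the block below streams chunks through <code>partial_fit</code> on both <code>StandardScaler</code> and <code>IncrementalPCA</code>. The <code>chunks()</code> generator and its synthetic data are assumptions for illustration; in practice it would be replaced by something like <code>pd.read_csv(path, chunksize=...)</code>.</p>

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA
from sklearn.preprocessing import StandardScaler

n_features = 6


def chunks(n_chunks=5, chunk_rows=200, seed=0):
    """Hypothetical stand-in for pd.read_csv(path, chunksize=...)."""
    rng = np.random.default_rng(seed)
    for _ in range(n_chunks):
        yield rng.normal(size=(chunk_rows, n_features))


# Pass 1: learn the per-feature mean and variance one chunk at a time.
scaler = StandardScaler()
for chunk in chunks():
    scaler.partial_fit(chunk)

# Pass 2: feed the scaled chunks to the incremental PCA.
# Each chunk must contain at least n_components rows.
ipca = IncrementalPCA(n_components=n_features)
for chunk in chunks():
    ipca.partial_fit(scaler.transform(chunk))

cumulative = ipca.explained_variance_ratio_.cumsum()
# Smallest number of components whose cumulative ratio reaches 85%.
n_keep = int(np.searchsorted(cumulative, 0.85) + 1)
print(n_keep, cumulative.round(3))
```

<p>Two passes over the data are needed here because the scaler has to see every chunk before any chunk can be standardized consistently.</p>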
<ul class="">
<li id="956b" class="ne nf iu jt b ju jv jy jz kc ng kg nh kk ni ko nj nk nl nm fw" data-selectable-paragraph=""><strong class="jt iv">AutoEncoders: </strong>Any neural-network-based method naturally supports stochastic or mini-batch learning. It requires us to call the fit method (formerly fit_generator, which has since been folded into fit) with a data generator instead of the actual data. In the next section, I show a generic data generator that could be used for any data, and also build a sample autoencoder that compresses the dimensions, similar to PCA. How an AutoEncoder compares to PCA is outside the scope of this post, so that is probably a topic for another post.</li>
<li id="9d25" class="ne nf iu jt b ju nn jy no kc np kg nq kk nr ko nj nk nl nm fw" data-selectable-paragraph=""><strong class="jt iv">Incremental Matrix Factorization: </strong>Matrix Factorization is a popular method used not only for dimension reduction but also in recommender systems, for generating user/item embeddings, etc. The latent dimensions produced by the factorization can be used as the reduced set of dimensions that represents the matrix, as shown in the figure below. The typical setup requires loading the entire response matrix into RAM and performing the matrix decomposition, which gets very expensive as the data size and sparsity increase. Alternatively, one can use a stochastic learning framework to learn the responses with any common algorithm, like Alternating Least Squares, to factorize the matrix, and thus decompose any large matrix with a smaller amount of RAM. One such application is explained well in Incremental SGD (ISGD), J. Vinagre et al., 2014. Below is a small snippet that performs such incremental SGD in a simple response-matrix setup.</li>
</ul>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="kw kx do ky ce kz" tabindex="0" role="button">
<div class="gh gi og"><picture><source srcset="https://miro.medium.com/max/640/1*Cpvz9-TTCJcp1siBkifcGA.png 640w, https://miro.medium.com/max/720/1*Cpvz9-TTCJcp1siBkifcGA.png 720w, https://miro.medium.com/max/750/1*Cpvz9-TTCJcp1siBkifcGA.png 750w, https://miro.medium.com/max/786/1*Cpvz9-TTCJcp1siBkifcGA.png 786w, https://miro.medium.com/max/828/1*Cpvz9-TTCJcp1siBkifcGA.png 828w, https://miro.medium.com/max/1100/1*Cpvz9-TTCJcp1siBkifcGA.png 1100w, https://miro.medium.com/max/1400/1*Cpvz9-TTCJcp1siBkifcGA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1400/1*Cpvz9-TTCJcp1siBkifcGA.png" alt="" width="700" height="232" /></picture></div>
</div>
</figure>
<figure class="kr ks kt ku gt kv">
<div class="m fo l do">
<div class="zl nt l">
<div class="gist-data">
<div class="js-gist-file-update-container js-task-list-container file-box">
<div id="file-isgd" class="file my-2">
<div class="Box-body p-0 blob-wrapper data type-text  ">
<div class="js-check-bidi js-blob-code-container blob-code-content">
<table class="highlight tab-size js-file-line-container js-code-nav-container js-tagsearch-file" data-hpc="" data-tab-size="8" data-paste-markdown-skip="" data-tagsearch-lang="" data-tagsearch-path="isgd">
<tbody>
<tr>
<td id="file-isgd-LC1" class="blob-code blob-code-inner js-file-line">import numpy as np</td>
</tr>
<tr>
<td id="file-isgd-L2" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="2"></td>
<td id="file-isgd-LC2" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-isgd-L3" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="3"></td>
<td id="file-isgd-LC3" class="blob-code blob-code-inner js-file-line">class incrementalSGD:</td>
</tr>
<tr>
<td id="file-isgd-L4" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="4"></td>
<td id="file-isgd-LC4" class="blob-code blob-code-inner js-file-line">def __init__(self, n_users, n_items, n_factors, alpha = 0.001, l2 = 0.01) -> None:</td>
</tr>
<tr>
<td id="file-isgd-L5" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="5"></td>
<td id="file-isgd-LC5" class="blob-code blob-code-inner js-file-line">self.n_users = n_users</td>
</tr>
<tr>
<td id="file-isgd-L6" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="6"></td>
<td id="file-isgd-LC6" class="blob-code blob-code-inner js-file-line">self.n_items = n_items</td>
</tr>
<tr>
<td id="file-isgd-L7" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="7"></td>
<td id="file-isgd-LC7" class="blob-code blob-code-inner js-file-line">self.n_factors = n_factors</td>
</tr>
<tr>
<td id="file-isgd-L8" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="8"></td>
<td id="file-isgd-LC8" class="blob-code blob-code-inner js-file-line">self.alpha = alpha</td>
</tr>
<tr>
<td id="file-isgd-L9" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="9"></td>
<td id="file-isgd-LC9" class="blob-code blob-code-inner js-file-line">self.l2 = l2</td>
</tr>
<tr>
<td id="file-isgd-L10" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="10"></td>
<td id="file-isgd-LC10" class="blob-code blob-code-inner js-file-line">self.user_latent = np.random.normal(0., 0.1, (self.n_users, self.n_factors))</td>
</tr>
<tr>
<td id="file-isgd-L11" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="11"></td>
<td id="file-isgd-LC11" class="blob-code blob-code-inner js-file-line">self.item_latent = np.random.normal(0., 0.1, (self.n_items, self.n_factors))</td>
</tr>
<tr>
<td id="file-isgd-L12" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="12"></td>
<td id="file-isgd-LC12" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-isgd-L13" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="13"></td>
<td id="file-isgd-LC13" class="blob-code blob-code-inner js-file-line">def factorize(self, user_index, item_index, response):</td>
</tr>
<tr>
<td id="file-isgd-L14" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="14"></td>
<td id="file-isgd-LC14" class="blob-code blob-code-inner js-file-line">user_vector = self.user_latent[user_index]</td>
</tr>
<tr>
<td id="file-isgd-L15" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="15"></td>
<td id="file-isgd-LC15" class="blob-code blob-code-inner js-file-line">item_vector = self.item_latent[item_index]</td>
</tr>
<tr>
<td id="file-isgd-L16" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="16"></td>
<td id="file-isgd-LC16" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-isgd-L17" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="17"></td>
<td id="file-isgd-LC17" class="blob-code blob-code-inner js-file-line">err = response - np.inner(user_vector, item_vector)</td>
</tr>
<tr>
<td id="file-isgd-L18" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="18"></td>
<td id="file-isgd-LC18" class="blob-code blob-code-inner js-file-line">self.user_latent[user_index] = user_vector + self.alpha * (err * item_vector - self.l2 * user_vector)</td>
</tr>
<tr>
<td id="file-isgd-L19" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="19"></td>
<td id="file-isgd-LC19" class="blob-code blob-code-inner js-file-line">self.item_latent[item_index] = item_vector + self.alpha * (err * user_vector - self.l2 * item_vector)</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
</div>
<div class="gist-meta"><a href="https://gist.github.com/meetpramodr/eda1d8aa3f31d44c4c205510d35ad378/raw/3ce211f2083b407232dc71f40a917811a0007bae/isgd">view raw</a></div>
</div>
</div>
</figure>
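<p>To show how such a class might be driven, here is a minimal, self-contained sketch that streams (user, item, response) triples from a small synthetic rank-2 matrix. The class name <code>IncrementalSGD</code>, the <code>seed</code> argument, the toy data, and the choice to copy the vectors before updating are all assumptions made for illustration, not part of the original gist.</p>

```python
import numpy as np


class IncrementalSGD:
    def __init__(self, n_users, n_items, n_factors, alpha=0.02, l2=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.alpha = alpha  # learning rate
        self.l2 = l2        # L2 regularization strength
        self.user_latent = rng.normal(0.0, 0.1, (n_users, n_factors))
        self.item_latent = rng.normal(0.0, 0.1, (n_items, n_factors))

    def factorize(self, user_index, item_index, response):
        # Copy the rows so both updates use the pre-update vectors.
        user_vector = self.user_latent[user_index].copy()
        item_vector = self.item_latent[item_index].copy()
        err = response - np.inner(user_vector, item_vector)
        self.user_latent[user_index] = user_vector + self.alpha * (
            err * item_vector - self.l2 * user_vector)
        self.item_latent[item_index] = item_vector + self.alpha * (
            err * user_vector - self.l2 * item_vector)
        return err


# Toy response matrix with an exact rank-2 structure (illustrative data only).
rng = np.random.default_rng(1)
ratings = rng.normal(size=(20, 2)) @ rng.normal(size=(15, 2)).T


def rmse(model):
    pred = model.user_latent @ model.item_latent.T
    return float(np.sqrt(np.mean((ratings - pred) ** 2)))


model = IncrementalSGD(n_users=20, n_items=15, n_factors=2)
rmse_before = rmse(model)

pairs = [(u, i) for u in range(20) for i in range(15)]
for _ in range(200):                       # repeated passes over the stream
    for idx in rng.permutation(len(pairs)):
        u, i = pairs[idx]
        model.factorize(u, i, ratings[u, i])  # one observation at a time

rmse_after = rmse(model)
print(round(rmse_before, 3), round(rmse_after, 3))
```

<p>Only the two latent matrices stay in memory; each observation touches one user row and one item row, which is what makes the approach suitable for streaming.</p>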
<p id="66d8" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Clustering:</strong></p>
<p id="6a53" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Several partial-fit-based algorithms are available that incrementally learn homogeneous groups of observations and add them to the clusters. Examples include MiniBatchKMeans and Birch.</p>
<p id="63bd" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">MiniBatchKMeans</strong> randomly samples a mini-batch of observations at each iteration during training. In the early iterations, the centroids are local to the sampled space, so they shift substantially as later iterations sample different regions. After enough iterations, they converge towards the true global centroids. To dampen the large per-iteration updates caused by random sampling, one hack is to update each centroid as an incremental average (there we go again!!) of its current position over all the previous mini-batches.</p>
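<p>The incremental-average idea can be sketched in plain NumPy as follows. This is a simplified illustration rather than scikit-learn's implementation: each centroid keeps a count of the points assigned to it so far and moves towards a new point with step size 1/count. The function name <code>minibatch_kmeans</code>, its parameters, and the two-blob toy data are assumptions.</p>

```python
import numpy as np


def minibatch_kmeans(X, k, batch_size=64, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    # Start from k distinct observations picked at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    counts = np.zeros(k)
    for _ in range(n_iter):
        batch = X[rng.choice(len(X), size=batch_size, replace=False)]
        # Assign each sampled point to its nearest centroid.
        dists = ((batch[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        nearest = dists.argmin(axis=1)
        for x, j in zip(batch, nearest):
            counts[j] += 1
            # Running mean: the step shrinks as the centroid sees more points.
            centroids[j] += (x - centroids[j]) / counts[j]
    return centroids, counts


# Two well-separated blobs around (-5, -5) and (5, 5).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(-5.0, 1.0, size=(500, 2)),
               rng.normal(5.0, 1.0, size=(500, 2))])

centroids, counts = minibatch_kmeans(X, k=2)
centroids = centroids[np.argsort(centroids[:, 0])]  # sort for stable printing
print(centroids.round(2))
```

<p>Because the step size decays with the count, a single unrepresentative mini-batch late in training can only nudge a centroid slightly.</p>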
<p id="6d41" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Birch</strong>, on the other hand, builds a tree called the Clustering Feature Tree, which can be treated as a lossy compression of the data. The leaf nodes can then be treated as centroids. For more details, I recommend this <a class="au kp" href="https://towardsdatascience.com/machine-learning-birch-clustering-algorithm-clearly-explained-fb9838cbeed9" target="_blank" rel="noopener">blog</a> by Cory Maklin, which is quite intuitive.</p>
<p id="8bf8" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The partial_fit usage for MiniBatchKMeans and Birch is well documented on the scikit-learn site.</p>
<p id="9003" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Classification/Prediction:</strong></p>
<p id="f2ba" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Supervised learning can be performed either with Stochastic Gradient Descent’s partial fit or with incremental least squares (code given below). The idea is similar: we read mini-batches of observations and adjust the weights stochastically. Scikit-learn offers several estimators that support this, such as SGDRegressor, SGDClassifier, MultinomialNB and BernoulliNB (Naive Bayes), PassiveAggressiveClassifier, and Perceptron.</p>
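<p>As a minimal sketch of the partial-fit route, the block below trains scikit-learn's <code>SGDClassifier</code> one mini-batch at a time. The <code>next_batch</code> helper and its synthetic, linearly separable data are assumptions standing in for chunks read from disk.</p>

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])  # hypothetical ground-truth weights


def next_batch(batch_size=256):
    """Hypothetical stand-in for one chunk read from disk."""
    X = rng.normal(size=(batch_size, 3))
    y = (X @ true_w > 0).astype(int)
    return X, y


clf = SGDClassifier(random_state=0)
classes = np.array([0, 1])  # all labels must be declared on the first call

for _ in range(50):
    X, y = next_batch()
    clf.partial_fit(X, y, classes=classes)  # updates weights, keeps no data

X_test, y_test = next_batch(1000)
accuracy = clf.score(X_test, y_test)
print(round(accuracy, 3))
```

<p>The <code>classes</code> argument matters because an early chunk may not contain every label, and the estimator cannot discover new labels later.</p>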
<p id="cc2f" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">One can also tweak the least-squares algorithm to incrementally update the covariance matrix, which lets us partially fit the data. Below is code that I found on Stack Exchange, which does this:</p>
<p data-selectable-paragraph="">
<table class="highlight tab-size js-file-line-container js-code-nav-container js-tagsearch-file" data-hpc="" data-tab-size="8" data-paste-markdown-skip="" data-tagsearch-lang="" data-tagsearch-path="leastsquare_partialfit">
<tbody>
<tr>
<td id="file-leastsquare_partialfit-LC1" class="blob-code blob-code-inner js-file-line">import pandas as pd</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L2" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="2"></td>
<td id="file-leastsquare_partialfit-LC2" class="blob-code blob-code-inner js-file-line">import numpy as np</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L3" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="3"></td>
<td id="file-leastsquare_partialfit-LC3" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L4" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="4"></td>
<td id="file-leastsquare_partialfit-LC4" class="blob-code blob-code-inner js-file-line">file_path = '../data/train.csv'</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L5" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="5"></td>
<td id="file-leastsquare_partialfit-LC5" class="blob-code blob-code-inner js-file-line">chunksize = 1024</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L6" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="6"></td>
<td id="file-leastsquare_partialfit-LC6" class="blob-code blob-code-inner js-file-line">X_vars = ['hour','banner_pos','C1','device_type','device_conn_type','C14','C15','C16','C17','C18','C19','C20','C21']</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L7" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="7"></td>
<td id="file-leastsquare_partialfit-LC7" class="blob-code blob-code-inner js-file-line">y_var = 'click'</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L8" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="8"></td>
<td id="file-leastsquare_partialfit-LC8" class="blob-code blob-code-inner js-file-line">meanX = np.zeros((chunksize,len(X_vars)))</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L9" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="9"></td>
<td id="file-leastsquare_partialfit-LC9" class="blob-code blob-code-inner js-file-line">meanY = np.zeros(chunksize)</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L10" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="10"></td>
<td id="file-leastsquare_partialfit-LC10" class="blob-code blob-code-inner js-file-line">varX = 0</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L11" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="11"></td>
<td id="file-leastsquare_partialfit-LC11" class="blob-code blob-code-inner js-file-line">covXY = 0</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L12" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="12"></td>
<td id="file-leastsquare_partialfit-LC12" class="blob-code blob-code-inner js-file-line">meanXY = 0</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L13" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="13"></td>
<td id="file-leastsquare_partialfit-LC13" class="blob-code blob-code-inner js-file-line">varY = 0</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L14" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="14"></td>
<td id="file-leastsquare_partialfit-LC14" class="blob-code blob-code-inner js-file-line">alpha=0.001</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L15" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="15"></td>
<td id="file-leastsquare_partialfit-LC15" class="blob-code blob-code-inner js-file-line">betas = np.zeros(len(X_vars))</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L16" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="16"></td>
<td id="file-leastsquare_partialfit-LC16" class="blob-code blob-code-inner js-file-line">df = pd.read_csv(file_path,chunksize=chunksize)</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L17" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="17"></td>
<td id="file-leastsquare_partialfit-LC17" class="blob-code blob-code-inner js-file-line">c = 0</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L18" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="18"></td>
<td id="file-leastsquare_partialfit-LC18" class="blob-code blob-code-inner js-file-line">for chunk in df:</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L19" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="19"></td>
<td id="file-leastsquare_partialfit-LC19" class="blob-code blob-code-inner js-file-line">X = chunk.loc[:,X_vars]</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L20" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="20"></td>
<td id="file-leastsquare_partialfit-LC20" class="blob-code blob-code-inner js-file-line">y = chunk.loc[:,y_var]</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L21" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="21"></td>
<td id="file-leastsquare_partialfit-LC21" class="blob-code blob-code-inner js-file-line">dx = X - meanX</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L22" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="22"></td>
<td id="file-leastsquare_partialfit-LC22" class="blob-code blob-code-inner js-file-line">dy = y - meanY</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L23" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="23"></td>
<td id="file-leastsquare_partialfit-LC23" class="blob-code blob-code-inner js-file-line">dxy = (X*y) - meanXY</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L24" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="24"></td>
<td id="file-leastsquare_partialfit-LC24" class="blob-code blob-code-inner js-file-line">varX += ((1-alpha)*dx*dx - varX)*alpha</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L25" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="25"></td>
<td id="file-leastsquare_partialfit-LC25" class="blob-code blob-code-inner js-file-line">varY += ((1-alpha)*dy*dy - varY)*alpha</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L26" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="26"></td>
<td id="file-leastsquare_partialfit-LC26" class="blob-code blob-code-inner js-file-line">covXY += ((1-alpha)*dx*dy - covXY)*alpha</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L27" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="27"></td>
<td id="file-leastsquare_partialfit-LC27" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L28" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="28"></td>
<td id="file-leastsquare_partialfit-LC28" class="blob-code blob-code-inner js-file-line">meanX += dx * alpha</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L29" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="29"></td>
<td id="file-leastsquare_partialfit-LC29" class="blob-code blob-code-inner js-file-line">meanY += dy * alpha</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L30" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="30"></td>
<td id="file-leastsquare_partialfit-LC30" class="blob-code blob-code-inner js-file-line">meanXY += dxy * alpha</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L31" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="31"></td>
<td id="file-leastsquare_partialfit-LC31" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L32" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="32"></td>
<td id="file-leastsquare_partialfit-LC32" class="blob-code blob-code-inner js-file-line">betas = covXY/varX</td>
</tr>
<tr>
<td id="file-leastsquare_partialfit-L33" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="33"></td>
<td id="file-leastsquare_partialfit-LC33" class="blob-code blob-code-inner js-file-line">bias = meanY - np.dot(betas,meanX)</td>
</tr>
</tbody>
</table>
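<p>The snippet above keeps exponentially weighted, per-feature statistics. A different way to fit least squares from chunks, sketched below under stated assumptions, is to accumulate the normal-equation sums X<sup>T</sup>X and X<sup>T</sup>y and solve once at the end. This is a swapped-in technique, not the Stack Exchange code; the class name <code>ChunkedLeastSquares</code> and the synthetic data are hypothetical.</p>

```python
import numpy as np


class ChunkedLeastSquares:
    """Ordinary least squares from chunks via accumulated X^T X and X^T y."""

    def __init__(self, n_features):
        d = n_features + 1  # one extra column for the intercept
        self.xtx = np.zeros((d, d))
        self.xty = np.zeros(d)

    def partial_fit(self, X, y):
        Xb = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
        self.xtx += Xb.T @ Xb
        self.xty += Xb.T @ y
        return self

    def solve(self):
        # lstsq tolerates a rank-deficient X^T X (e.g. constant features).
        coef, *_ = np.linalg.lstsq(self.xtx, self.xty, rcond=None)
        return coef[0], coef[1:]  # bias, betas


# Synthetic stream: y = 3 + X @ [1.5, -2.0, 0.5] + small noise.
rng = np.random.default_rng(0)
true_betas = np.array([1.5, -2.0, 0.5])
true_bias = 3.0

model = ChunkedLeastSquares(n_features=3)
for _ in range(20):  # 20 chunks of 1024 rows each
    X = rng.normal(size=(1024, 3))
    y = true_bias + X @ true_betas + rng.normal(scale=0.1, size=1024)
    model.partial_fit(X, y)

bias, betas = model.solve()
print(round(float(bias), 3), betas.round(3))
```

<p>Memory use depends only on the number of features, not the number of rows, and the result matches a single batch fit on the concatenated data up to floating-point error.</p>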
<p id="c282" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">TensorFlow also supports fit_generator (now folded into the fit function itself), where one can write a generator method to push the data in a streaming fashion for mini-batch stochastic learning. TensorFlow ships a ready-made <a class="au kp" href="https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator" target="_blank" rel="noopener ugc nofollow">image generator</a>. Below is an implementation of a sample custom generator.</p>
<figure class="kr ks kt ku gt kv">
<div class="m fo l do">
<div class="zm nt l">
<table class="highlight tab-size js-file-line-container js-code-nav-container js-tagsearch-file" data-hpc="" data-tab-size="8" data-paste-markdown-skip="" data-tagsearch-lang="" data-tagsearch-path="customdatagenerator">
<tbody>
<tr>
<td id="file-customdatagenerator-LC1" class="blob-code blob-code-inner js-file-line">import numpy as np</td>
</tr>
<tr>
<td id="file-customdatagenerator-L2" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="2"></td>
<td id="file-customdatagenerator-LC2" class="blob-code blob-code-inner js-file-line">def generator(X_data, y_data, batch_size):</td>
</tr>
<tr>
<td id="file-customdatagenerator-L3" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="3"></td>
<td id="file-customdatagenerator-LC3" class="blob-code blob-code-inner js-file-line">samples_per_epoch = X_data.shape[0]</td>
</tr>
<tr>
<td id="file-customdatagenerator-L4" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="4"></td>
<td id="file-customdatagenerator-LC4" class="blob-code blob-code-inner js-file-line">number_of_batches = samples_per_epoch/batch_size</td>
</tr>
<tr>
<td id="file-customdatagenerator-L5" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="5"></td>
<td id="file-customdatagenerator-LC5" class="blob-code blob-code-inner js-file-line">counter=0</td>
</tr>
<tr>
<td id="file-customdatagenerator-L6" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="6"></td>
<td id="file-customdatagenerator-LC6" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-customdatagenerator-L7" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="7"></td>
<td id="file-customdatagenerator-LC7" class="blob-code blob-code-inner js-file-line">while 1:</td>
</tr>
<tr>
<td id="file-customdatagenerator-L8" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="8"></td>
<td id="file-customdatagenerator-LC8" class="blob-code blob-code-inner js-file-line">indices = list(range(batch_size*counter,min(batch_size*(counter+1),samples_per_epoch)))</td>
</tr>
<tr>
<td id="file-customdatagenerator-L9" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="9"></td>
<td id="file-customdatagenerator-LC9" class="blob-code blob-code-inner js-file-line">np.random.shuffle(indices)</td>
</tr>
<tr>
<td id="file-customdatagenerator-L10" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="10"></td>
<td id="file-customdatagenerator-LC10" class="blob-code blob-code-inner js-file-line">X_batch = np.array(X_data[indices]).astype('float32')</td>
</tr>
<tr>
<td id="file-customdatagenerator-L11" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="11"></td>
<td id="file-customdatagenerator-LC11" class="blob-code blob-code-inner js-file-line">y_batch = np.array(y_data[indices]).astype('int')</td>
</tr>
<tr>
<td id="file-customdatagenerator-L12" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="12"></td>
<td id="file-customdatagenerator-LC12" class="blob-code blob-code-inner js-file-line">counter += 1</td>
</tr>
<tr>
<td id="file-customdatagenerator-L13" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="13"></td>
<td id="file-customdatagenerator-LC13" class="blob-code blob-code-inner js-file-line">yield X_batch,y_batch</td>
</tr>
<tr>
<td id="file-customdatagenerator-L14" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="14"></td>
<td id="file-customdatagenerator-LC14" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-customdatagenerator-L15" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="15"></td>
<td id="file-customdatagenerator-LC15" class="blob-code blob-code-inner js-file-line">#restart counter to yield data in the next epoch as well</td>
</tr>
<tr>
<td id="file-customdatagenerator-L16" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="16"></td>
<td id="file-customdatagenerator-LC16" class="blob-code blob-code-inner js-file-line">if counter >= number_of_batches:</td>
</tr>
<tr>
<td id="file-customdatagenerator-L17" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="17"></td>
<td id="file-customdatagenerator-LC17" class="blob-code blob-code-inner js-file-line">counter = 0</td>
</tr>
<tr>
<td id="file-customdatagenerator-L18" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="18"></td>
<td id="file-customdatagenerator-LC18" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-customdatagenerator-L19" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="19"></td>
<td id="file-customdatagenerator-LC19" class="blob-code blob-code-inner js-file-line">batch_size = 512</td>
</tr>
<tr>
<td id="file-customdatagenerator-L20" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="20"></td>
<td id="file-customdatagenerator-LC20" class="blob-code blob-code-inner js-file-line"></td>
</tr>
<tr>
<td id="file-customdatagenerator-L21" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="21"></td>
<td id="file-customdatagenerator-LC21" class="blob-code blob-code-inner js-file-line">history = tfmodel.fit_generator(</td>
</tr>
<tr>
<td id="file-customdatagenerator-L22" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="22"></td>
<td id="file-customdatagenerator-LC22" class="blob-code blob-code-inner js-file-line">generator(trainX,trainY,batch_size),</td>
</tr>
<tr>
<td id="file-customdatagenerator-L23" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="23"></td>
<td id="file-customdatagenerator-LC23" class="blob-code blob-code-inner js-file-line">epochs=10,</td>
</tr>
<tr>
<td id="file-customdatagenerator-L24" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="24"></td>
<td id="file-customdatagenerator-LC24" class="blob-code blob-code-inner js-file-line">steps_per_epoch = trainX.shape[0]/batch_size,</td>
</tr>
<tr>
<td id="file-customdatagenerator-L25" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="25"></td>
<td id="file-customdatagenerator-LC25" class="blob-code blob-code-inner js-file-line">validation_data = generator(testX,testY,batch_size),</td>
</tr>
<tr>
<td id="file-customdatagenerator-L26" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="26"></td>
<td id="file-customdatagenerator-LC26" class="blob-code blob-code-inner js-file-line">validation_steps = testX.shape[0]/batch_size</td>
</tr>
<tr>
<td id="file-customdatagenerator-L27" class="blob-num js-line-number js-code-nav-line-number js-blob-rnum" data-line-number="27"></td>
<td id="file-customdatagenerator-LC27" class="blob-code blob-code-inner js-file-line">)</td>
</tr>
</tbody>
</table>
</div>
</div>
</figure>
<h1 id="9f50" class="mn lg iu bm lh mo mp mq ll mr ms mt lp mu mv mw ls mx my mz lv na nb nc ly nd fw" data-selectable-paragraph="">Conclusion</h1>
<p id="ade8" class="pw-post-body-paragraph jr js iu jt b ju ma jw jx jy mb ka kb kc mc ke kf kg md ki kj kk me km kn ko in fw" data-selectable-paragraph="">In summary, do not get bogged down by the size of the data. With the right set of tools (both mathematical and computational) and the right attitude, we should be able to take on a dataset of any size, however monstrous.</p>
<p id="bb81" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Let me know your comments and thoughts.</p>
<p data-selectable-paragraph="">Originally published on <a href="https://medium.com/falabellatechnology/build-models-on-massively-big-data-using-continuous-online-learning-c1707d4bb02c">Medium</a></p><p>The post <a href="https://falabellaindia.com/portfolio/build-models-on-massively-big-data-using-continuous-online-learning/">Build Models on Massively Big Data using Continuous/Online Learning</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Retail Product comparison Architecture using different AI/ML/NLP techniques</title>
		<link>https://falabellaindia.com/portfolio/retail-product-comparison-architecture-using-different-ai-ml-nlp-techniques/</link>
		
		<dc:creator><![CDATA[admin]]></dc:creator>
		<pubDate>Thu, 20 Oct 2022 21:53:05 +0000</pubDate>
				<guid isPermaLink="false">https://falabellaindia.com/?post_type=portfolio&#038;p=2977</guid>

					<description><![CDATA[<p>There are different ways in which we can compare the products in market. But we ultimately end up in some challenges or less accuracy in reaching the target details. To overcome these challenges, we could go for cascading approaches which is one of the good approaches. Below approaches can be followed in a cascading approach [&#8230;]</p>
<p>The post <a href="https://falabellaindia.com/portfolio/retail-product-comparison-architecture-using-different-ai-ml-nlp-techniques/">Retail Product comparison Architecture using different AI/ML/NLP techniques</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></description>
										<content:encoded><![CDATA[<p id="a634" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">There are several ways to compare products in the market, but we ultimately run into challenges or lower accuracy when trying to reach the target details.</p>
<p id="ffa8" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">To overcome these challenges, one good option is a cascading approach.</p>
<p id="ae27" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The following techniques can be applied one after another in a cascade:</p>
<p id="0e91" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">1. Title Similarity detection of different products</p>
<p id="d624" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">2. Image Similarity detection of different products</p>
<p id="a22b" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">3. Attribute extraction/detection of different products</p>
<p id="e80a" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">4. Price comparison of different products</p>
<p id="86df" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Once similarity has been detected using the above approaches, compute the probability that the products match.</p>
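<p data-selectable-paragraph="">As a rough illustration of the cascade, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the 0.5 title gate and the 0.7/0.3 weights are hypothetical, a plain string ratio stands in for a trained title model, and the image and attribute stages are left out for brevity.</p>

```python
from difflib import SequenceMatcher


def title_similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for a trained title model."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def price_similarity(p1, p2):
    """1.0 for identical prices, falling towards 0 as the relative gap grows.

    Assumes both prices are positive.
    """
    return 1.0 - abs(p1 - p2) / max(p1, p2)


def match_score(prod_a, prod_b, title_gate=0.5):
    """Cascade: stop early on a weak title match, otherwise blend the stage scores."""
    t = title_similarity(prod_a["title"], prod_b["title"])
    if t < title_gate:
        return 0.0  # the later stages are skipped entirely
    p = price_similarity(prod_a["price"], prod_b["price"])
    return 0.7 * t + 0.3 * p  # hypothetical weights, not tuned on any data
```

<p data-selectable-paragraph="">The point of the early exit is that cheap checks run first, so the more expensive stages only see pairs that already look plausible.</p>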
<p id="67dd" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">TITLE SIMILARITY MODEL:</strong></p>
<figure class="kq kr ks kt gt ku gh gi paragraph-image">
<div class="kv kw do kx ce ky" tabindex="0" role="button">
<div class="gh gi kp"><picture><source srcset="https://miro.medium.com/max/640/1*TH_BVDK7HnqyTLYQJW5K-w.png 640w, https://miro.medium.com/max/720/1*TH_BVDK7HnqyTLYQJW5K-w.png 720w, https://miro.medium.com/max/750/1*TH_BVDK7HnqyTLYQJW5K-w.png 750w, https://miro.medium.com/max/786/1*TH_BVDK7HnqyTLYQJW5K-w.png 786w, https://miro.medium.com/max/828/1*TH_BVDK7HnqyTLYQJW5K-w.png 828w, https://miro.medium.com/max/1100/1*TH_BVDK7HnqyTLYQJW5K-w.png 1100w, https://miro.medium.com/max/1400/1*TH_BVDK7HnqyTLYQJW5K-w.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce kz la c" role="presentation" src="https://miro.medium.com/max/1400/1*TH_BVDK7HnqyTLYQJW5K-w.png" alt="" width="700" height="250" /></picture></div>
</div>
</figure>
<p id="a679" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">IMAGE SIMILARITY MODEL:</strong></p>
<figure class="kq kr ks kt gt ku gh gi paragraph-image">
<div class="kv kw do kx ce ky" tabindex="0" role="button">
<div class="gh gi lb"><picture><source srcset="https://miro.medium.com/max/640/1*JYoq1e5ivJmHoe1q37W2Yg.png 640w, https://miro.medium.com/max/720/1*JYoq1e5ivJmHoe1q37W2Yg.png 720w, https://miro.medium.com/max/750/1*JYoq1e5ivJmHoe1q37W2Yg.png 750w, https://miro.medium.com/max/786/1*JYoq1e5ivJmHoe1q37W2Yg.png 786w, https://miro.medium.com/max/828/1*JYoq1e5ivJmHoe1q37W2Yg.png 828w, https://miro.medium.com/max/1100/1*JYoq1e5ivJmHoe1q37W2Yg.png 1100w, https://miro.medium.com/max/1400/1*JYoq1e5ivJmHoe1q37W2Yg.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce kz la c" role="presentation" src="https://miro.medium.com/max/1400/1*JYoq1e5ivJmHoe1q37W2Yg.png" alt="" width="700" height="291" /></picture></div>
</div>
</figure>
<p id="8f75" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">DESCRIPTION/ATTRIBUTE EXTRACTION MODEL:</strong></p>
<figure class="kq kr ks kt gt ku gh gi paragraph-image">
<div class="kv kw do kx ce ky" tabindex="0" role="button">
<div class="gh gi lc"><picture><source srcset="https://miro.medium.com/max/640/1*ej35TE0SjFuEuZFAMw44nQ.png 640w, https://miro.medium.com/max/720/1*ej35TE0SjFuEuZFAMw44nQ.png 720w, https://miro.medium.com/max/750/1*ej35TE0SjFuEuZFAMw44nQ.png 750w, https://miro.medium.com/max/786/1*ej35TE0SjFuEuZFAMw44nQ.png 786w, https://miro.medium.com/max/828/1*ej35TE0SjFuEuZFAMw44nQ.png 828w, https://miro.medium.com/max/1100/1*ej35TE0SjFuEuZFAMw44nQ.png 1100w, https://miro.medium.com/max/1400/1*ej35TE0SjFuEuZFAMw44nQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce kz la c" role="presentation" src="https://miro.medium.com/max/1400/1*ej35TE0SjFuEuZFAMw44nQ.png" alt="" width="700" height="395" /></picture></div>
</div>
</figure>
<p id="85bf" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The above architecture can be used to compare titles and descriptions. If we need to compare the individual features mentioned in product titles across different websites or links, we should first extract them with NLP techniques (using OCR) and then compare them using regular expressions or NLP techniques. Price is another feature that can be extracted and compared in the same way.</p>
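<p data-selectable-paragraph="">The regex part of this step could look roughly like the sketch below. It assumes the text has already been extracted (for example by OCR), and the two patterns and the attribute names are hypothetical; a real catalogue would need many more rules.</p>

```python
import re

# Hypothetical patterns for two attributes; not an exhaustive rule set.
PATTERNS = {
    "storage_gb": re.compile(r"(\d+)\s*GB", re.IGNORECASE),
    "price": re.compile(r"(?:Rs\.?|\$)\s*([\d,]+(?:\.\d+)?)"),
}


def extract_attributes(text):
    """Return the first match of each pattern as a float, or None if absent."""
    found = {}
    for name, pattern in PATTERNS.items():
        m = pattern.search(text)
        found[name] = float(m.group(1).replace(",", "")) if m else None
    return found


def same_attributes(text_a, text_b):
    """Two listings agree when every extracted attribute is equal."""
    return extract_attributes(text_a) == extract_attributes(text_b)
```

<p data-selectable-paragraph="">Normalizing each value to a number before comparing means that "128 GB" and "128GB", or "54,999" and "54999.00", are treated as the same attribute value.</p>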
<p data-selectable-paragraph="">Originally published on <a href="https://medium.com/falabellatechnology/retail-product-comparison-architecture-using-different-ai-ml-nlp-techniques-ff9ae0ced410">Medium</a></p><p>The post <a href="https://falabellaindia.com/portfolio/retail-product-comparison-architecture-using-different-ai-ml-nlp-techniques/">Retail Product comparison Architecture using different AI/ML/NLP techniques</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Reinforcement learning &#8211; Implementation using SARSA</title>
		<link>https://falabellaindia.com/portfolio/reinforcement-learning-implementation-using-sarsa/</link>
		
		<dc:creator><![CDATA[admin]]></dc:creator>
		<pubDate>Thu, 20 Oct 2022 21:49:18 +0000</pubDate>
				<guid isPermaLink="false">https://falabellaindia.com/?post_type=portfolio&#038;p=2972</guid>

					<description><![CDATA[<p>Reinforcement learning — Step by Step Implementation using SARSA In this tutorial, I have given the step by step implementation of Reinforcement Learning (RL) using SARSA algorithm. Before jumping on to coding and functions, I have briefly explained the subject to easily understand the code and the steps that have been followed one after the [&#8230;]</p>
<p>The post <a href="https://falabellaindia.com/portfolio/reinforcement-learning-implementation-using-sarsa/">Reinforcement learning – Implementation using SARSA</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></description>
										<content:encoded><![CDATA[<article>
<div class="l">
<div class="l">
<section>
<div>
<div class="in io ip iq ir">
<p id="9974" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Reinforcement learning — Step by Step Implementation using SARSA</strong></p>
<p id="ff1f" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">In this tutorial, I give a step-by-step implementation of Reinforcement Learning (RL) using the SARSA algorithm. Before jumping into the code and functions, I briefly explain the subject so that the code and the order of the steps are easy to follow. If this is your first time learning about RL-SARSA, I hope you will benefit from the tutorial below. I will also try to keep this blog updated with the latest information as much as I can.</em></p>
<p id="b413" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Take the example of a driverless scooter taxi (doing the work of a food delivery person) that carries a food parcel from one place to another.</em></p>
<p id="535a" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Image:</em></p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="kw kx do ky ce kz" tabindex="0" role="button">
<div class="gh gi kq"><picture><source srcset="https://miro.medium.com/max/640/0*Y_q6QGJEVRVM3xnR.png 640w, https://miro.medium.com/max/720/0*Y_q6QGJEVRVM3xnR.png 720w, https://miro.medium.com/max/750/0*Y_q6QGJEVRVM3xnR.png 750w, https://miro.medium.com/max/786/0*Y_q6QGJEVRVM3xnR.png 786w, https://miro.medium.com/max/828/0*Y_q6QGJEVRVM3xnR.png 828w, https://miro.medium.com/max/1100/0*Y_q6QGJEVRVM3xnR.png 1100w, https://miro.medium.com/max/1400/0*Y_q6QGJEVRVM3xnR.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1400/0*Y_q6QGJEVRVM3xnR.png" alt="" width="700" height="450" /></picture></div>
</div>
</figure>
<p id="d0fe" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Before proceeding further on implementing RL, we should know the following:</em></p>
<p id="42ec" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">The main processes of RL are:</em></strong></p>
<p id="4e45" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Observe, Decide, Act, Receive, Learn and Iterate</em></p>
<p id="53c9" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Observe means observing the environment of the agent.</em></p>
<p id="8910" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Decide means deciding what to do based on the observation.</em></p>
<p id="9860" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Act means taking action on the decision.</em></p>
<p id="daeb" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Receive means receiving a reward or a penalty for the action.</em></p>
<p id="e5da" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Learn means learning from previous actions and improving.</em></p>
<p id="8512" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Iterate means repeating the entire process until the goal has been reached.</em></p>
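<p data-selectable-paragraph="">To make the cycle concrete, here is a minimal sketch of these six processes as a SARSA training loop. The five-cell corridor, the <code>step</code> and <code>decide</code> helpers and all hyperparameter values are hypothetical and chosen only for illustration; they are not the taxi environment used later in this tutorial.</p>

```python
import random

N_STATES, GOAL = 5, 4          # hypothetical corridor: states 0..4, goal at 4
ACTIONS = (0, 1)               # 0 = left, 1 = right


def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = nxt == GOAL
    return nxt, (20 if done else -1), done


def decide(q, state, epsilon, rng):
    """Epsilon-greedy choice over the Q-table row for this state."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):                        # iterate
        state = 0                                    # observe
        action = decide(q, state, epsilon, rng)      # decide
        done = False
        while not done:
            nxt, reward, done = step(state, action)  # act, receive
            nxt_action = decide(q, nxt, epsilon, rng)
            # SARSA target uses the action actually chosen in the next state
            target = reward + (0.0 if done else gamma * q[nxt][nxt_action])
            q[state][action] += alpha * (target - q[state][action])  # learn
            state, action = nxt, nxt_action
    return q
```

<p data-selectable-paragraph="">Calling <code>train()</code> returns the learned Q table, one row per state and one column per action. SARSA is on-policy: the update uses the next action the agent really takes, which is why the next action is chosen before the Q value is updated.</p>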
<p id="a79c" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">The main components of RL are</em></strong></p>
<p id="6aa9" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Environment, Agent, Rewards, Goals and Actions, State, Policy</em></p>
<p id="a0fa" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">An agent is the learner and the decision-maker, e.g. the taxi.</p>
<p id="07cf" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">The environment </em></strong><em class="kp">is the place where the agent learns and decides what actions to perform.</em></p>
<p id="60ee" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">Actions are</em></strong><em class="kp"> the set of moves that the agent can perform.</em></p>
<p id="301d" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">State</em></strong><em class="kp"> is the state of the agent in the environment.</em></p>
<p id="eaf3" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">A reward</em></strong><em class="kp"> is nothing but a scalar value. For each action selected by the agent, the environment provides a reward. It can be positive or negative, and it can be an instant or a long-term reward.</em></p>
<p id="3c62" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">We need the above components to implement RL and to build Q table/Optimal policy</em></p>
<p id="8bc3" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">With respect to the example in the image above, let’s work out these components and build the Q table/optimal policy.</em></p>
<p id="ba92" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">States:</em></strong></p>
<p id="aa7d" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">In the above image, the area is divided into a 5*5 grid, which means it has 25 possible states. The current location of the scooter (bike taxi) is (2,3).</em></p>
<p id="4289" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">This is just an example. In a real-world scenario, the number of possible states might run into the millions 🙂</em></p>
<p id="0bed" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">The food can be at R, G, Y or B, or inside the scooter, which gives 5 food states.</em></p>
<p id="f48e" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">There are 4 locations where the food can be picked up or delivered.</em></p>
<p id="6341" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Hence the total number of possible states =</em></p>
<p id="cfc5" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">25 possible states * 5 food states * 4 locations for picking up/delivering the food</em></p>
<p id="adf9" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">= 500 possible states</em></p>
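<p data-selectable-paragraph="">The arithmetic above can be checked with a short sketch that also packs the components into a single state index. The <code>encode</code> and <code>decode</code> helpers and the ordering of the components are assumptions made for illustration; any one-to-one mapping onto 0..499 would serve equally well as a row index into the Q table.</p>

```python
GRID_SIZE = 5
GRID_CELLS = GRID_SIZE * GRID_SIZE   # scooter positions on the 5*5 grid
FOOD_STATES = 5                      # food at R, G, Y, B or inside the scooter
DROP_LOCATIONS = 4                   # R, G, Y, B

TOTAL_STATES = GRID_CELLS * FOOD_STATES * DROP_LOCATIONS


def encode(row, col, food, drop):
    """Pack the four components into one index in range(TOTAL_STATES)."""
    return ((row * GRID_SIZE + col) * FOOD_STATES + food) * DROP_LOCATIONS + drop


def decode(index):
    """Inverse of encode: recover (row, col, food, drop) from the index."""
    index, drop = divmod(index, DROP_LOCATIONS)
    index, food = divmod(index, FOOD_STATES)
    row, col = divmod(index, GRID_SIZE)
    return row, col, food, drop
```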
<p id="4f27" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">Actions:</em></strong></p>
<p id="86b3" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Actions could be the scooter moving up, down, right or left, picking up the food, and delivering the food. That makes 6 actions in total.</em></p>
<p id="6f87" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">Rewards:</em></strong></p>
<p id="5714" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">When it comes to rewards, </em>the scooter receives a reward whenever it takes an action.</p>
<p id="9bbf" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">It can be positive or negative.<em class="kp"> Rewards are defined by the programmer.</em></p>
<p id="df67" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Let’s define the rewards for this example as below:</em></p>
<p id="f786" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Reward to pick up the food from the right place: 10</em></p>
<p id="fb62" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Reward to deliver the food in the right place: 20</em></p>
<p id="45e5" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Wrong moves: -1. This is the penalty the scooter receives for a wrong action, e.g. if the scooter drives into a dead end, it gets a reward of -1.</em></p>
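<p data-selectable-paragraph="">These reward values can be written down as a small lookup, sketched below. The event labels (<code>pickup_ok</code>, <code>delivery_ok</code>) are hypothetical names, and treating every other outcome as a wrong move is a simplifying assumption.</p>

```python
PICKUP_REWARD = 10        # food picked up from the right place
DELIVERY_REWARD = 20      # food delivered to the right place
WRONG_MOVE_PENALTY = -1   # any wrong move, e.g. driving into a dead end


def reward_for(event):
    """Map an outcome label (hypothetical names) to the reward defined above."""
    table = {"pickup_ok": PICKUP_REWARD, "delivery_ok": DELIVERY_REWARD}
    return table.get(event, WRONG_MOVE_PENALTY)


def episode_return(events):
    """Total reward collected over a sequence of outcomes."""
    return sum(reward_for(e) for e in events)
```

<p data-selectable-paragraph="">For example, an episode with one dead end before the pickup and one before the delivery adds up to -1 + 10 - 1 + 20.</p>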
<p id="78fb" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Gym Package:</p>
<p id="7d3d" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Gym is a toolkit that helps you develop and compare reinforcement learning algorithms. The gym library is a collection of test problems (environments) that you can use to test your reinforcement learning algorithms. These environments share a common interface, which allows you to write general algorithms.</p>
<p id="696c" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">It includes environments such as FrozenLake-v0, Atari, 2D and 3D robots, Taxi, etc.</em></p>
<p id="f942" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">The other environment names can be found at the link below:</em></p>
<p id="32fe" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><a class="au lc" href="https://github.com/openai/gym/blob/master/gym/envs/" target="_blank" rel="noopener ugc nofollow"><em class="kp">https://github.com/openai/gym/blob/master/gym/envs/</em></a></p>
<p id="823d" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">In OpenAI’s gym package, the following functions can be used for implementing RL:</em></p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi ld"><picture><source srcset="https://miro.medium.com/max/640/0*NQU2CqUoc-efH5Y2 640w, https://miro.medium.com/max/720/0*NQU2CqUoc-efH5Y2 720w, https://miro.medium.com/max/750/0*NQU2CqUoc-efH5Y2 750w, https://miro.medium.com/max/786/0*NQU2CqUoc-efH5Y2 786w, https://miro.medium.com/max/828/0*NQU2CqUoc-efH5Y2 828w, https://miro.medium.com/max/1100/0*NQU2CqUoc-efH5Y2 1100w, https://miro.medium.com/max/120/0*NQU2CqUoc-efH5Y2 120w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 60px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/120/0*NQU2CqUoc-efH5Y2" alt="" width="60" height="22" /></picture></div>
</figure>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="kw kx do ky ce kz" tabindex="0" role="button">
<div class="gh gi le"><picture><source srcset="https://miro.medium.com/max/640/0*li2wj9teJWNE-UPS.png 640w, https://miro.medium.com/max/720/0*li2wj9teJWNE-UPS.png 720w, https://miro.medium.com/max/750/0*li2wj9teJWNE-UPS.png 750w, https://miro.medium.com/max/786/0*li2wj9teJWNE-UPS.png 786w, https://miro.medium.com/max/828/0*li2wj9teJWNE-UPS.png 828w, https://miro.medium.com/max/1100/0*li2wj9teJWNE-UPS.png 1100w, https://miro.medium.com/max/1400/0*li2wj9teJWNE-UPS.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1400/0*li2wj9teJWNE-UPS.png" alt="" width="700" height="268" /></picture></div>
</div>
</figure>
<p id="91c6" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">env.reset resets the environment and returns a random initial state. We can also reset to a particular state.</p>
<p id="d390" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">env.step(action) advances the environment by one time step.</p>
<p id="d4b7" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">env.render visualizes the environment together with the agent’s location.</p>
<p id="e062" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Observation is what the agent observes of the environment after the step.</p>
<p id="73e5" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Reward is the reward earned by the previous action. It can be positive or negative.</p>
<p id="45f7" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Done tells us whether we have reached the goal: it is True if the goal has been reached and False otherwise.</p>
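<p data-selectable-paragraph="">To show how these pieces fit together without depending on a particular Gym release, the sketch below uses a tiny stand-in environment. <code>CorridorEnv</code> is hypothetical and is not part of Gym; it only imitates the reset/step/render shape described above, and unlike the description it always starts from the same state. The exact values returned by <code>reset</code> and <code>step</code> differ between Gym releases, so check the version you install.</p>

```python
class CorridorEnv:
    """Hypothetical 1-D environment with a Gym-like reset/step/render shape."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        """Put the agent back at the start and return the initial observation."""
        self.state = 0
        return self.state

    def step(self, action):
        """Advance one time step; action 1 moves right, anything else moves left."""
        move = 1 if action == 1 else -1
        self.state = min(self.length - 1, max(0, self.state + move))
        done = self.state == self.length - 1
        reward = 20 if done else -1
        return self.state, reward, done, {}

    def render(self):
        """Text view of the corridor with the agent marked as 'A'."""
        return "".join("A" if i == self.state else "." for i in range(self.length))


env = CorridorEnv()
observation = env.reset()
done = False
total = 0
while not done:
    observation, reward, done, info = env.step(1)  # always move right
    total += reward
```

<p data-selectable-paragraph="">The loop keeps stepping until <code>done</code> becomes True, accumulating the reward along the way; a learning agent would replace the fixed action with one chosen from its Q table.</p>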
<p id="07fd" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">Now let’s get into the step-by-step implementation:</em></strong></p>
<p id="b57d" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">Step 1: #Import the following libraries</em></strong></p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi lf"><picture><source srcset="https://miro.medium.com/max/640/0*T3XVj7LHD54nptGg.png 640w, https://miro.medium.com/max/720/0*T3XVj7LHD54nptGg.png 720w, https://miro.medium.com/max/750/0*T3XVj7LHD54nptGg.png 750w, https://miro.medium.com/max/786/0*T3XVj7LHD54nptGg.png 786w, https://miro.medium.com/max/828/0*T3XVj7LHD54nptGg.png 828w, https://miro.medium.com/max/1100/0*T3XVj7LHD54nptGg.png 1100w, https://miro.medium.com/max/564/0*T3XVj7LHD54nptGg.png 564w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 282px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/564/0*T3XVj7LHD54nptGg.png" alt="" width="282" height="122" /></picture></div>
</figure>
<p id="5f87" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">#The pickle file is mainly used to store our Q table and optimal policy information</em></p>
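<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">As a rough illustration of the pickle step, the sketch below saves a small Q table to disk and loads it back. The table contents and the file name <code>q_table.pkl</code> are assumptions made for this example only.</p>

```python
import os
import pickle
import tempfile

# Hypothetical 3-state, 2-action Q table stored as nested lists.
q_table = [[0.0, 0.5], [0.2, 0.9], [0.0, 0.0]]

# Write into a temporary directory; the file name is an assumption.
path = os.path.join(tempfile.mkdtemp(), "q_table.pkl")

with open(path, "wb") as f:
    pickle.dump(q_table, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == q_table)
```

<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.</p>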
<p id="514f" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">Step 2:</em></strong><em class="kp"> #</em><strong class="jt iv"><em class="kp">Build/create the environment</em></strong></p>
<p id="7617" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">(For now, we will use an existing environment from Gym.)</em></p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi lg"><picture><source srcset="https://miro.medium.com/max/640/0*CIRb2rWdkFOqlD_Z.png 640w, https://miro.medium.com/max/720/0*CIRb2rWdkFOqlD_Z.png 720w, https://miro.medium.com/max/750/0*CIRb2rWdkFOqlD_Z.png 750w, https://miro.medium.com/max/786/0*CIRb2rWdkFOqlD_Z.png 786w, https://miro.medium.com/max/828/0*CIRb2rWdkFOqlD_Z.png 828w, https://miro.medium.com/max/1100/0*CIRb2rWdkFOqlD_Z.png 1100w, https://miro.medium.com/max/670/0*CIRb2rWdkFOqlD_Z.png 670w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 335px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/670/0*CIRb2rWdkFOqlD_Z.png" alt="" width="335" height="37" /></picture></div>
</figure>
<p id="4e63" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">To check for other environments in Gym, you can use the following code:</p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi lh"><picture><source srcset="https://miro.medium.com/max/640/0*r9CPu1OLWVMJLQA3.png 640w, https://miro.medium.com/max/720/0*r9CPu1OLWVMJLQA3.png 720w, https://miro.medium.com/max/750/0*r9CPu1OLWVMJLQA3.png 750w, https://miro.medium.com/max/786/0*r9CPu1OLWVMJLQA3.png 786w, https://miro.medium.com/max/828/0*r9CPu1OLWVMJLQA3.png 828w, https://miro.medium.com/max/1100/0*r9CPu1OLWVMJLQA3.png 1100w, https://miro.medium.com/max/792/0*r9CPu1OLWVMJLQA3.png 792w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 396px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/792/0*r9CPu1OLWVMJLQA3.png" alt="" width="396" height="125" /></picture></div>
</figure>
<p id="38ab" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">Step 3: Initialize all the parameters along with the Q table as below:</em></strong></p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi li"><picture><source srcset="https://miro.medium.com/max/640/0*yW68IciKjRJ1tJ3X.png 640w, https://miro.medium.com/max/720/0*yW68IciKjRJ1tJ3X.png 720w, https://miro.medium.com/max/750/0*yW68IciKjRJ1tJ3X.png 750w, https://miro.medium.com/max/786/0*yW68IciKjRJ1tJ3X.png 786w, https://miro.medium.com/max/828/0*yW68IciKjRJ1tJ3X.png 828w, https://miro.medium.com/max/1100/0*yW68IciKjRJ1tJ3X.png 1100w, https://miro.medium.com/max/858/0*yW68IciKjRJ1tJ3X.png 858w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 429px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/858/0*yW68IciKjRJ1tJ3X.png" alt="" width="429" height="270" /></picture></div>
</figure>
<p id="7e2d" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">#Q table initialization as below:</em></p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="kw kx do ky ce kz" tabindex="0" role="button">
<div class="gh gi lj"><picture><source srcset="https://miro.medium.com/max/640/0*Sxvdpx87fzNs0pb-.png 640w, https://miro.medium.com/max/720/0*Sxvdpx87fzNs0pb-.png 720w, https://miro.medium.com/max/750/0*Sxvdpx87fzNs0pb-.png 750w, https://miro.medium.com/max/786/0*Sxvdpx87fzNs0pb-.png 786w, https://miro.medium.com/max/828/0*Sxvdpx87fzNs0pb-.png 828w, https://miro.medium.com/max/1100/0*Sxvdpx87fzNs0pb-.png 1100w, https://miro.medium.com/max/1400/0*Sxvdpx87fzNs0pb-.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1400/0*Sxvdpx87fzNs0pb-.png" alt="" width="700" height="39" /></picture></div>
</div>
</figure>
<p id="f6b1" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Initially, all values in the Q-table are set to 0. An action is then chosen for the current state. As the agent moves, the Q value of a state-action pair is increased whenever that action leads to a good reward in the next state, and decreased when it does not.</em></p>
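<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">A minimal sketch of this initialization, using plain Python lists so it runs without extra packages. The table sizes and the alpha and gamma values are assumptions for illustration; with Gym the sizes would normally be read from the environment’s observation and action spaces, and NumPy’s <code>np.zeros((n_states, n_actions))</code> is the more common way to build the table.</p>

```python
# Hypothetical sizes; with Gym these would come from the environment's
# observation and action spaces.
n_states = 16
n_actions = 4

# Illustrative hyperparameter values (alpha and gamma are assumptions).
epsilon = 0.1   # exploration rate
alpha = 0.5     # learning rate
gamma = 0.9     # discount rate

# One row per state, one column per action, all starting at zero.
Q = [[0.0] * n_actions for _ in range(n_states)]

print(len(Q), len(Q[0]))
```

<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The list comprehension creates a separate row for every state, so updating one state never changes another.</p>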
<p id="33ea" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Step 4:</strong></p>
<p id="e8dc" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">PART-A:</strong></p>
<p id="30af" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Now proceed to define the epsilon-greedy method and the SARSA algorithm to build the Q table</strong></p>
<p id="c67f" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">We have to build a table called the Q-table, which has dimensions S × A, where S is the number of states and A is the number of actions.</em></p>
<p id="c429" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">For every state there are A possible actions. The likelihood of choosing a particular action depends on the values in the Q table, which are also known as state-action values.</em></p>
<p id="29d9" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">SARSA depends on the current state, current action, reward obtained, next state and next action. SARSA stands for State Action Reward State Action, which symbolizes the tuple (s, a, r, s’, a’). SARSA is an on-policy, model-free method that uses the action performed by the current policy to learn the Q-value.</em></p>
<p id="2b2a" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">#Write the function below to choose the action based on the epsilon-greedy method</strong></p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi lk"><picture><source srcset="https://miro.medium.com/max/640/0*2gHeLzAqUVBrmHTv.png 640w, https://miro.medium.com/max/720/0*2gHeLzAqUVBrmHTv.png 720w, https://miro.medium.com/max/750/0*2gHeLzAqUVBrmHTv.png 750w, https://miro.medium.com/max/786/0*2gHeLzAqUVBrmHTv.png 786w, https://miro.medium.com/max/828/0*2gHeLzAqUVBrmHTv.png 828w, https://miro.medium.com/max/1100/0*2gHeLzAqUVBrmHTv.png 1100w, https://miro.medium.com/max/1318/0*2gHeLzAqUVBrmHTv.png 1318w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 659px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1318/0*2gHeLzAqUVBrmHTv.png" alt="" width="659" height="306" /></picture></div>
</figure>
<p id="7ae4" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">In the above function, if the random number generated is less than 0.1 (the epsilon threshold), we go for exploration; otherwise we go for exploitation, picking the action with the highest value in the Q table for the current state.</p>
<p id="a2a3" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Exploration means we should try to get more details about the environment.</em></p>
<p id="18be" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Exploitation means we have to aim for maximizing the reward by making use of the information which is already found.</em></p>
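<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The trade-off between exploration and exploitation can be sketched as follows. This is one possible epsilon-greedy implementation, not necessarily the one in the screenshot; the function name <code>choose_action</code>, its parameters and the sample Q values are assumptions.</p>

```python
import random


def choose_action(Q, state, epsilon, rng=random):
    """Epsilon-greedy choice over one row of a Q table (list of lists)."""
    n_actions = len(Q[state])
    if rng.random() < epsilon:
        # Exploration: pick any action uniformly at random.
        return rng.randrange(n_actions)
    # Exploitation: pick the action with the highest Q value
    # (ties resolve to the lowest action index).
    return max(range(n_actions), key=lambda a: Q[state][a])


Q = [[0.0, 0.2, 0.7, 0.1]]
greedy = choose_action(Q, 0, epsilon=0.0)     # never explores
explored = choose_action(Q, 0, epsilon=1.0)   # always explores
print(greedy, explored in range(4))
```

<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">With epsilon set to 0.1, roughly one decision in ten is a random exploratory move and the rest follow the current Q table.</p>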
<p id="c208" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">#Use below SARSA formula to learn Q value and update Q table:</em></p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="kw kx do ky ce kz" tabindex="0" role="button">
<div class="gh gi le"><picture><source srcset="https://miro.medium.com/max/640/0*GyGAx2_fw1QEEoXO.png 640w, https://miro.medium.com/max/720/0*GyGAx2_fw1QEEoXO.png 720w, https://miro.medium.com/max/750/0*GyGAx2_fw1QEEoXO.png 750w, https://miro.medium.com/max/786/0*GyGAx2_fw1QEEoXO.png 786w, https://miro.medium.com/max/828/0*GyGAx2_fw1QEEoXO.png 828w, https://miro.medium.com/max/1100/0*GyGAx2_fw1QEEoXO.png 1100w, https://miro.medium.com/max/1400/0*GyGAx2_fw1QEEoXO.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1400/0*GyGAx2_fw1QEEoXO.png" alt="" width="700" height="77" /></picture></div>
</div>
</figure>
<p id="e18d" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">In the above formula, we have the current state and action, and the next state and next action. We also have alpha, which is the learning rate, and gamma, which is the discount rate.</em></p>
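<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">For readers who cannot view the image, the standard SARSA update rule, written with the symbols described above, is given here. It is assumed that the figure shows this usual form, where r is the reward received after taking action a in state s.</p>

```latex
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \, Q(s', a') - Q(s, a) \right]
```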
<p id="efde" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">PART-B:</em></strong></p>
<p id="ec94" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">#Write a function as below to build the Q table. It includes the SARSA formula</em></strong></p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="kw kx do ky ce kz" tabindex="0" role="button">
<div class="gh gi le"><picture><source srcset="https://miro.medium.com/max/640/0*_vah3d7ukiqGZd6c.png 640w, https://miro.medium.com/max/720/0*_vah3d7ukiqGZd6c.png 720w, https://miro.medium.com/max/750/0*_vah3d7ukiqGZd6c.png 750w, https://miro.medium.com/max/786/0*_vah3d7ukiqGZd6c.png 786w, https://miro.medium.com/max/828/0*_vah3d7ukiqGZd6c.png 828w, https://miro.medium.com/max/1100/0*_vah3d7ukiqGZd6c.png 1100w, https://miro.medium.com/max/1400/0*_vah3d7ukiqGZd6c.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1400/0*_vah3d7ukiqGZd6c.png" alt="" width="700" height="156" /></picture></div>
</div>
</figure>
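<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">A single SARSA update can be sketched as a small function. The name <code>sarsa_update</code>, its signature and the sample numbers are assumptions for illustration; the Q table is a plain list of lists.</p>

```python
def sarsa_update(Q, state, action, reward, next_state, next_action, alpha, gamma):
    """Apply one SARSA update to Q in place and return the new value."""
    # Target uses the Q value of the action actually chosen in the next state.
    target = reward + gamma * Q[next_state][next_action]
    Q[state][action] += alpha * (target - Q[state][action])
    return Q[state][action]


Q = [[0.0, 0.0], [0.0, 1.0]]
new_value = sarsa_update(Q, state=0, action=1, reward=0.5,
                         next_state=1, next_action=1, alpha=0.5, gamma=0.9)
print(new_value)
```

<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Here the target is 0.5 + 0.9 × 1.0 = 1.4, and with a learning rate of 0.5 the entry moves halfway from 0 towards it.</p>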
<p id="1422" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">Step 5: Do the training using the epsilon-greedy method and the SARSA algorithm.</em></strong></p>
<p id="d442" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Here the hyperparameter is total_no_episodes, the total number of training episodes.</em></p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="kw kx do ky ce kz" tabindex="0" role="button">
<div class="gh gi le"><picture><source srcset="https://miro.medium.com/max/640/0*hNSilhvxxuMiMTID.png 640w, https://miro.medium.com/max/720/0*hNSilhvxxuMiMTID.png 720w, https://miro.medium.com/max/750/0*hNSilhvxxuMiMTID.png 750w, https://miro.medium.com/max/786/0*hNSilhvxxuMiMTID.png 786w, https://miro.medium.com/max/828/0*hNSilhvxxuMiMTID.png 828w, https://miro.medium.com/max/1100/0*hNSilhvxxuMiMTID.png 1100w, https://miro.medium.com/max/1400/0*hNSilhvxxuMiMTID.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1400/0*hNSilhvxxuMiMTID.png" alt="" width="700" height="324" /></picture></div>
</div>
</figure>
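<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Putting the pieces together, a training loop might look like the sketch below. It runs on a hypothetical <code>LineWorld</code> corridor instead of a real Gym environment so that it is self-contained; the environment, the reward values, <code>max_steps</code>, and the alpha and gamma defaults are all assumptions. Only <code>total_no_episodes</code> is taken from the article.</p>

```python
import random


class LineWorld:
    """Hypothetical corridor environment with a Gym-like reset/step interface."""

    def __init__(self, n_states=5):
        self.n_states, self.n_actions = n_states, 2   # actions: 0 = left, 1 = right
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1        # rightmost cell is the goal
        return self.state, (1.0 if done else -0.1), done, {}


def choose_action(Q, state, epsilon, rng):
    if rng.random() < epsilon:
        return rng.randrange(len(Q[state]))                        # explore
    return max(range(len(Q[state])), key=lambda a: Q[state][a])    # exploit


def train(env, total_no_episodes=200, max_steps=100,
          epsilon=0.1, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    episode_rewards = []
    for _episode in range(total_no_episodes):
        state = env.reset()
        action = choose_action(Q, state, epsilon, rng)
        total = 0.0
        for _step in range(max_steps):
            next_state, reward, done, _info = env.step(action)
            next_action = choose_action(Q, next_state, epsilon, rng)
            # SARSA: bootstrap from the action the policy will actually take next.
            target = reward + gamma * Q[next_state][next_action]
            Q[state][action] += alpha * (target - Q[state][action])
            state, action = next_state, next_action
            total += reward
            if done:
                break
        episode_rewards.append(total)
    return Q, episode_rewards


Q, episode_rewards = train(LineWorld())
print(len(episode_rewards))
```

<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Because the next action is chosen before the update and then actually executed, the loop is on-policy, which is what distinguishes SARSA from Q-learning.</p>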
<p id="1766" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">Step 6: #Evaluate your SARSA model using the rewards and the total number of episodes</em></strong></p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi ll"><picture><source srcset="https://miro.medium.com/max/640/0*n1x1y9YmplhZHweY.png 640w, https://miro.medium.com/max/720/0*n1x1y9YmplhZHweY.png 720w, https://miro.medium.com/max/750/0*n1x1y9YmplhZHweY.png 750w, https://miro.medium.com/max/786/0*n1x1y9YmplhZHweY.png 786w, https://miro.medium.com/max/828/0*n1x1y9YmplhZHweY.png 828w, https://miro.medium.com/max/1100/0*n1x1y9YmplhZHweY.png 1100w, https://miro.medium.com/max/1152/0*n1x1y9YmplhZHweY.png 1152w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 576px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1152/0*n1x1y9YmplhZHweY.png" alt="" width="576" height="111" /></picture></div>
</figure>
<p id="17d3" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv"><em class="kp">Step 7: Get the Q table and visualize the matrix</em></strong></p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi ld"><picture><source srcset="https://miro.medium.com/max/640/0*fKbWvhOpzyA9wDJN 640w, https://miro.medium.com/max/720/0*fKbWvhOpzyA9wDJN 720w, https://miro.medium.com/max/750/0*fKbWvhOpzyA9wDJN 750w, https://miro.medium.com/max/786/0*fKbWvhOpzyA9wDJN 786w, https://miro.medium.com/max/828/0*fKbWvhOpzyA9wDJN 828w, https://miro.medium.com/max/1100/0*fKbWvhOpzyA9wDJN 1100w, https://miro.medium.com/max/120/0*fKbWvhOpzyA9wDJN 120w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 60px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/120/0*fKbWvhOpzyA9wDJN" alt="" width="60" height="6" /></picture></div>
</figure>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi lm"><picture><source srcset="https://miro.medium.com/max/640/0*tcSgO604QrWSbuSV.png 640w, https://miro.medium.com/max/720/0*tcSgO604QrWSbuSV.png 720w, https://miro.medium.com/max/750/0*tcSgO604QrWSbuSV.png 750w, https://miro.medium.com/max/786/0*tcSgO604QrWSbuSV.png 786w, https://miro.medium.com/max/828/0*tcSgO604QrWSbuSV.png 828w, https://miro.medium.com/max/1100/0*tcSgO604QrWSbuSV.png 1100w, https://miro.medium.com/max/622/0*tcSgO604QrWSbuSV.png 622w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 311px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/622/0*tcSgO604QrWSbuSV.png" alt="" width="311" height="33" /></picture></div>
</figure>
<p id="7ba5" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">A sample Q table with dimensions S × A (states × actions, here 500 × 6) can look like the one below:</em></p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi ld"><picture><source srcset="https://miro.medium.com/max/640/0*0vbm0hnYXqhTDsQ6 640w, https://miro.medium.com/max/720/0*0vbm0hnYXqhTDsQ6 720w, https://miro.medium.com/max/750/0*0vbm0hnYXqhTDsQ6 750w, https://miro.medium.com/max/786/0*0vbm0hnYXqhTDsQ6 786w, https://miro.medium.com/max/828/0*0vbm0hnYXqhTDsQ6 828w, https://miro.medium.com/max/1100/0*0vbm0hnYXqhTDsQ6 1100w, https://miro.medium.com/max/120/0*0vbm0hnYXqhTDsQ6 120w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 60px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/120/0*0vbm0hnYXqhTDsQ6" alt="" width="60" height="15" /></picture></div>
</figure>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="kw kx do ky ce kz" tabindex="0" role="button">
<div class="gh gi le"><picture><source srcset="https://miro.medium.com/max/640/0*mZF-dSyIxNGkGOfC.png 640w, https://miro.medium.com/max/720/0*mZF-dSyIxNGkGOfC.png 720w, https://miro.medium.com/max/750/0*mZF-dSyIxNGkGOfC.png 750w, https://miro.medium.com/max/786/0*mZF-dSyIxNGkGOfC.png 786w, https://miro.medium.com/max/828/0*mZF-dSyIxNGkGOfC.png 828w, https://miro.medium.com/max/1100/0*mZF-dSyIxNGkGOfC.png 1100w, https://miro.medium.com/max/1400/0*mZF-dSyIxNGkGOfC.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1400/0*mZF-dSyIxNGkGOfC.png" alt="" width="700" height="175" /></picture></div>
</div>
</figure>
<p id="a976" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Note: the above Q table is just a sample.</p>
<p id="cf92" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Step 8: Randomly select a start state and count how many steps the agent takes to reach the goal, using the Q table created in the steps above</strong></p>
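<p data-selectable-paragraph="">As a rough sketch of how a Q table of this size can be inspected (the article’s own code appears only in the images above), the snippet below builds a 500 × 6 table and prints its shape and first few rows. The names <code>Q</code>, <code>q_shape</code> and <code>format_rows</code> are hypothetical, the values are zeros rather than trained values, and plain Python lists stand in for the array a Q table is usually stored in.</p>

```python
# Hypothetical stand-in for a trained Q table: 500 states x 6 actions,
# filled with zeros because the trained values are not reproduced here.
N_STATES, N_ACTIONS = 500, 6
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]


def q_shape(table):
    """Return (number of states, number of actions) of a nested-list Q table."""
    return len(table), len(table[0])


def format_rows(table, first_n=5):
    """Format the first few rows as text, one state per line."""
    lines = []
    for state, values in enumerate(table[:first_n]):
        cells = "  ".join(f"{v:8.3f}" for v in values)
        lines.append(f"state {state:3d}: {cells}")
    return "\n".join(lines)


print("Q table shape:", q_shape(Q))
print(format_rows(Q))
```

<p data-selectable-paragraph="">Each row holds the estimated value of every action in one state, so reading across a row shows which action the agent currently prefers there.</p>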
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi ld"><picture><source srcset="https://miro.medium.com/max/640/0*JDXZvajrP8HyxWGH 640w, https://miro.medium.com/max/720/0*JDXZvajrP8HyxWGH 720w, https://miro.medium.com/max/750/0*JDXZvajrP8HyxWGH 750w, https://miro.medium.com/max/786/0*JDXZvajrP8HyxWGH 786w, https://miro.medium.com/max/828/0*JDXZvajrP8HyxWGH 828w, https://miro.medium.com/max/1100/0*JDXZvajrP8HyxWGH 1100w, https://miro.medium.com/max/120/0*JDXZvajrP8HyxWGH 120w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 60px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/120/0*JDXZvajrP8HyxWGH" alt="" width="60" height="17" /></picture></div>
</figure>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="kw kx do ky ce kz" tabindex="0" role="button">
<div class="gh gi le"><picture><source srcset="https://miro.medium.com/max/640/0*MAAcfJxfdsA-6aBH.png 640w, https://miro.medium.com/max/720/0*MAAcfJxfdsA-6aBH.png 720w, https://miro.medium.com/max/750/0*MAAcfJxfdsA-6aBH.png 750w, https://miro.medium.com/max/786/0*MAAcfJxfdsA-6aBH.png 786w, https://miro.medium.com/max/828/0*MAAcfJxfdsA-6aBH.png 828w, https://miro.medium.com/max/1100/0*MAAcfJxfdsA-6aBH.png 1100w, https://miro.medium.com/max/1400/0*MAAcfJxfdsA-6aBH.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1400/0*MAAcfJxfdsA-6aBH.png" alt="" width="700" height="203" /></picture></div>
</div>
</figure>
<p id="04a0" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Sample visualizations of the first two steps and the last few outputs are shown below (rendered with env.render()):</p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi ln"><picture><source srcset="https://miro.medium.com/max/640/0*ey4y2yBwCgJe1QJf 640w, https://miro.medium.com/max/720/0*ey4y2yBwCgJe1QJf 720w, https://miro.medium.com/max/750/0*ey4y2yBwCgJe1QJf 750w, https://miro.medium.com/max/786/0*ey4y2yBwCgJe1QJf 786w, https://miro.medium.com/max/828/0*ey4y2yBwCgJe1QJf 828w, https://miro.medium.com/max/1100/0*ey4y2yBwCgJe1QJf 1100w, https://miro.medium.com/max/52/0*ey4y2yBwCgJe1QJf 52w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 26px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/52/0*ey4y2yBwCgJe1QJf" alt="" width="26" height="61" /></picture></div>
</figure>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi lo"><picture><source srcset="https://miro.medium.com/max/640/0*8pyJgsJkUlvjyNpb.png 640w, https://miro.medium.com/max/720/0*8pyJgsJkUlvjyNpb.png 720w, https://miro.medium.com/max/750/0*8pyJgsJkUlvjyNpb.png 750w, https://miro.medium.com/max/786/0*8pyJgsJkUlvjyNpb.png 786w, https://miro.medium.com/max/828/0*8pyJgsJkUlvjyNpb.png 828w, https://miro.medium.com/max/1100/0*8pyJgsJkUlvjyNpb.png 1100w, https://miro.medium.com/max/558/0*8pyJgsJkUlvjyNpb.png 558w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 279px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/558/0*8pyJgsJkUlvjyNpb.png" alt="" width="279" height="657" /></picture></div>
</figure>
<p id="0217" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">… (the other steps are displayed with their visualizations in the same way) …</p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi lp"><picture><source srcset="https://miro.medium.com/max/640/0*pobReQ6auhDPaD5T 640w, https://miro.medium.com/max/720/0*pobReQ6auhDPaD5T 720w, https://miro.medium.com/max/750/0*pobReQ6auhDPaD5T 750w, https://miro.medium.com/max/786/0*pobReQ6auhDPaD5T 786w, https://miro.medium.com/max/828/0*pobReQ6auhDPaD5T 828w, https://miro.medium.com/max/1100/0*pobReQ6auhDPaD5T 1100w, https://miro.medium.com/max/56/0*pobReQ6auhDPaD5T 56w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 28px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/56/0*pobReQ6auhDPaD5T" alt="" width="28" height="60" /></picture></div>
</figure>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi lq"><picture><source srcset="https://miro.medium.com/max/640/0*hYU-2tAbHhQL-zyw.png 640w, https://miro.medium.com/max/720/0*hYU-2tAbHhQL-zyw.png 720w, https://miro.medium.com/max/750/0*hYU-2tAbHhQL-zyw.png 750w, https://miro.medium.com/max/786/0*hYU-2tAbHhQL-zyw.png 786w, https://miro.medium.com/max/828/0*hYU-2tAbHhQL-zyw.png 828w, https://miro.medium.com/max/1100/0*hYU-2tAbHhQL-zyw.png 1100w, https://miro.medium.com/max/672/0*hYU-2tAbHhQL-zyw.png 672w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 336px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/672/0*hYU-2tAbHhQL-zyw.png" alt="" width="336" height="731" /></picture></div>
</figure>
<p id="f64a" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">In the above outputs, the highlighted marks indicate the current position of the scooter taxi in the environment, while the direction given in brackets is the move that the agent (scooter) will make next.</p>
<p id="82b9" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Finally, the food is delivered (dropped off) at the right place after 14 steps in the above case.</p>
<p id="2b18" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Step 9: Save the Q table to a pickle file for later use and load it back</strong></p>
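<p data-selectable-paragraph="">The idea behind Step 8 can be sketched as a greedy rollout: start from a random state, always take the action with the highest Q value, and count the steps until the episode ends. This is a minimal illustration and not the article’s code. <code>CorridorEnv</code> is a hypothetical toy environment (not the taxi environment) that imitates the classic Gym interface, where <code>reset()</code> returns a state and <code>step(action)</code> returns <code>(next_state, reward, done, info)</code>. The reward values and the hand-written Q table are assumptions as well.</p>

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: walk from a random cell to the last cell.

    Imitates the classic Gym interface: reset() returns a state and
    step(action) returns (next_state, reward, done, info).
    """

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        # Start anywhere except the goal cell.
        self.state = random.randrange(self.length - 1)
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; walls clamp the move.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 20 if done else -1  # assumed reward scheme for the toy example
        return self.state, reward, done, {}


def run_greedy_episode(env, q_table, max_steps=100):
    """Follow the highest-valued action in each state.

    Returns the number of steps taken, or None if the goal was not
    reached within max_steps.
    """
    state = env.reset()
    for step in range(1, max_steps + 1):
        values = q_table[state]
        action = values.index(max(values))  # greedy action (argmax)
        state, _reward, done, _info = env.step(action)
        if done:
            return step
    return None


# Hand-written Q table for the toy corridor: "right" always scores higher.
toy_q = [[0.0, 1.0] for _ in range(5)]
steps = run_greedy_episode(CorridorEnv(length=5), toy_q)
print("Steps taken to reach the goal:", steps)
```

<p data-selectable-paragraph="">The <code>max_steps</code> cap is a design choice: a poorly trained table can send the agent around in circles, and the cap turns that into a clear "not reached" result instead of an endless loop.</p>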
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi ld"><picture><source srcset="https://miro.medium.com/max/640/0*8KAH_mYUpaADDkZ- 640w, https://miro.medium.com/max/720/0*8KAH_mYUpaADDkZ- 720w, https://miro.medium.com/max/750/0*8KAH_mYUpaADDkZ- 750w, https://miro.medium.com/max/786/0*8KAH_mYUpaADDkZ- 786w, https://miro.medium.com/max/828/0*8KAH_mYUpaADDkZ- 828w, https://miro.medium.com/max/1100/0*8KAH_mYUpaADDkZ- 1100w, https://miro.medium.com/max/120/0*8KAH_mYUpaADDkZ- 120w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 60px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/120/0*8KAH_mYUpaADDkZ-" alt="" width="60" height="20" /></picture></div>
</figure>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi lr"><picture><source srcset="https://miro.medium.com/max/640/0*rif8q12ye8TXAT0G.png 640w, https://miro.medium.com/max/720/0*rif8q12ye8TXAT0G.png 720w, https://miro.medium.com/max/750/0*rif8q12ye8TXAT0G.png 750w, https://miro.medium.com/max/786/0*rif8q12ye8TXAT0G.png 786w, https://miro.medium.com/max/828/0*rif8q12ye8TXAT0G.png 828w, https://miro.medium.com/max/1100/0*rif8q12ye8TXAT0G.png 1100w, https://miro.medium.com/max/1220/0*rif8q12ye8TXAT0G.png 1220w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 610px" data-testid="og" /><img loading="lazy" decoding="async" class="ce la lb c" role="presentation" src="https://miro.medium.com/max/1220/0*rif8q12ye8TXAT0G.png" alt="" width="610" height="204" /></picture></div>
</figure>
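<p data-selectable-paragraph="">Saving the Q table and reading it back can be sketched with Python’s standard <code>pickle</code> module. This is a minimal example under stated assumptions: the small <code>q_table</code>, the file name <code>q_table.pkl</code> and the temporary directory are all hypothetical, and in practice the object saved would be the trained table.</p>

```python
import os
import pickle
import tempfile

# Hypothetical small Q table; the real one would be the trained table.
q_table = [[0.0, 1.5], [2.0, -0.5]]

# The file name is an assumption; a temporary directory keeps the example tidy.
path = os.path.join(tempfile.mkdtemp(), "q_table.pkl")

# Save (dump) the table; pickle files must be opened in binary mode.
with open(path, "wb") as f:
    pickle.dump(q_table, f)

# Load it back later and confirm nothing changed.
with open(path, "rb") as f:
    restored = pickle.load(f)

print("Round trip matches:", restored == q_table)
```

<p data-selectable-paragraph="">Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.</p>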
<p id="2092" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Referred links:</strong></p>
<p id="f89c" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><a class="au lc" href="https://github.com/openai/gym/" target="_blank" rel="noopener ugc nofollow">https://github.com/openai/gym/</a></p>
<p id="3bf9" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><a class="au lc" href="https://www.geeksforgeeks.org/" target="_blank" rel="noopener ugc nofollow">https://www.geeksforgeeks.org/</a></p>
</div>
<p data-selectable-paragraph="">Originally published on <a href="https://medium.com/falabellatechnology/reinforcement-learning-implementation-using-sarsa-6506f5c66b01">Medium</a></p>
</div>
</section>
</div>
</div>
</article><p>The post <a href="https://falabellaindia.com/portfolio/reinforcement-learning-implementation-using-sarsa/">Reinforcement learning – Implementation using SARSA</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Test strategy and Test plan</title>
		<link>https://falabellaindia.com/portfolio/test-strategy-and-test-plan/</link>
		
		<dc:creator><![CDATA[admin]]></dc:creator>
		<pubDate>Thu, 20 Oct 2022 21:33:12 +0000</pubDate>
				<guid isPermaLink="false">https://falabellaindia.com/?post_type=portfolio&#038;p=2961</guid>

					<description><![CDATA[<p>Determining the quality of a product or an application is one of the most important phases in STLC. This requires R &#038; D on finalising the tools, resources, infra involved. The documentation post this initial R &#038; D lays the foundation of quality assurance. One such document is the test strategy document which is always [&#8230;]</p>
<p>The post <a href="https://falabellaindia.com/portfolio/test-strategy-and-test-plan/">Test strategy and Test plan</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></description>
										<content:encoded><![CDATA[<p id="26aa" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Determining the quality of a product or an application is one of the most important phases in the software testing life cycle (STLC). This requires R & D to finalise the tools, resources and infrastructure involved. The documentation produced after this initial R & D lays the foundation of quality assurance.</p>
<p id="2f8e" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">One such document is the test strategy document, which is often confused with the test plan document.</p>
<p id="def1" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Let’s clear up this confusion first.</p>
<p id="ae62" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The test strategy document is a one-time layout stating the testing approach for all releases.</p>
<p id="ffa0" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The test plan document, by contrast, needs to be prepared for every release.</p>
<p id="3213" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Now let us look at each document individually to understand the basic attributes required.</p>
<p id="ba65" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Test strategy document</strong></p>
<p id="142c" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">A test strategy template can include the following:</p>
<p id="f261" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">· document name, version, details of author, reviewer, approver,</em></p>
<p id="331e" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">· technology stack,</em></p>
<p id="9c77" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">· environment,</em></p>
<p id="52ac" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">· types of testing,</em></p>
<p id="0039" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">· tools used for each type of testing,</em></p>
<p id="ab73" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">· resource responsibilities,</em></p>
<p id="4f78" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">· risks and mitigations</em></p>
<p id="59b2" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Let’s get into the details of each parameter. The name of the document should call out the project name.</p>
<p id="7daa" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Format example: TestStrategy_ProjectName_version.</p>
<p id="c768" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Provide the following information for audit purposes:</p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi kq"><picture><source srcset="https://miro.medium.com/max/640/1*B6rR3KfiTAw0KjbNwY-7Sg.png 640w, https://miro.medium.com/max/720/1*B6rR3KfiTAw0KjbNwY-7Sg.png 720w, https://miro.medium.com/max/750/1*B6rR3KfiTAw0KjbNwY-7Sg.png 750w, https://miro.medium.com/max/786/1*B6rR3KfiTAw0KjbNwY-7Sg.png 786w, https://miro.medium.com/max/828/1*B6rR3KfiTAw0KjbNwY-7Sg.png 828w, https://miro.medium.com/max/1100/1*B6rR3KfiTAw0KjbNwY-7Sg.png 1100w, https://miro.medium.com/max/1236/1*B6rR3KfiTAw0KjbNwY-7Sg.png 1236w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 618px" data-testid="og" /><img loading="lazy" decoding="async" class="ce kw kx c" role="presentation" src="https://miro.medium.com/max/1236/1*B6rR3KfiTAw0KjbNwY-7Sg.png" alt="" width="618" height="200" /></picture></div>
</figure>
<p id="c205" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">It is important to know the software and hardware specifications of the application in order to choose the best tools for manual, automation and performance tests. Clearly define the technology used and all the tools involved.</p>
<p id="a783" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Specify the environments that will be used in each stage of testing. For example, it is common practice to have one lower environment for both functional and regression testing, and another production-like environment for a quick sanity test along with the priority 1 scenarios identified from the current release scope.</p>
<p id="ea41" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Finalise the types of testing that will be carried out. This will differ from application to application. For example, an application driven by APIs and events will require manual exploratory testing and automation testing, along with basic security and performance testing.</p>
<p id="1542" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Once the types of testing are finalised, it’s time to identify the tools that will be used for each of them. For the above scenario, it is suitable to use Postman for manual exploratory testing, REST Assured with TestNG for automation testing and Gatling for performance testing.</p>
<p id="2e09" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Pick your tools depending on the nature of your application.</p>
<p id="d589" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Now identify the number of resources that will be required for each type of testing, and state their responsibilities. Some projects have separate teams for manual, automation and performance testing, while others have one team for manual and automation testing and a different one for performance testing. Decide this structure and the number of resources based on the billing involved, the skillset and the timelines.</p>
<p id="c6ec" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Lastly, anticipate any risks and come up with mitigation plans for them. There is never a plan without risks. It is the way we handle them that determines success.</p>
<p id="0a46" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Once reviewed and approved, the test strategy remains the same throughout the project tenure.</p>
<p id="2fe3" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><strong class="jt iv">Test plan document</strong></p>
<p id="7a7b" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">On the other hand, a test plan has to be designed from scratch for every release of the project. The test plan template must clearly state the following:</p>
<p id="5191" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· document name, version, details of author, reviewer, approver,</p>
<p id="58a0" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· in scope functionalities,</p>
<p id="d0da" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· out of scope functionalities,</p>
<p id="f83f" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· risks and mitigations,</p>
<p id="714e" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· entry criteria,</p>
<p id="c2ad" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· exit criteria.</p>
<p id="53b1" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">How should each of these sections be filled in?</p>
<p id="4fb2" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The name of the document must be simple and must make the specific release easy to identify. Format example: TestPlan_ReleaseName_version.</p>
<p id="9beb" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">It is good practice to record the following information for audit purposes:</p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi ky"><picture><source srcset="https://miro.medium.com/max/640/1*F5yswknQ-D9RfVSdZpL05A.png 640w, https://miro.medium.com/max/720/1*F5yswknQ-D9RfVSdZpL05A.png 720w, https://miro.medium.com/max/750/1*F5yswknQ-D9RfVSdZpL05A.png 750w, https://miro.medium.com/max/786/1*F5yswknQ-D9RfVSdZpL05A.png 786w, https://miro.medium.com/max/828/1*F5yswknQ-D9RfVSdZpL05A.png 828w, https://miro.medium.com/max/1100/1*F5yswknQ-D9RfVSdZpL05A.png 1100w, https://miro.medium.com/max/1172/1*F5yswknQ-D9RfVSdZpL05A.png 1172w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 586px" data-testid="og" /><img loading="lazy" decoding="async" class="ce kw kx c" role="presentation" src="https://miro.medium.com/max/1172/1*F5yswknQ-D9RfVSdZpL05A.png" alt="" width="586" height="188" /></picture></div>
</figure>
<p id="7cb2" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The next page must be about the user stories committed for the release. Place them in the “In scope” section. Listing the user stories with their description will suffice.</p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi kz"><picture><source srcset="https://miro.medium.com/max/640/1*9_DVRSL82MMk6pTi0JDzeQ.png 640w, https://miro.medium.com/max/720/1*9_DVRSL82MMk6pTi0JDzeQ.png 720w, https://miro.medium.com/max/750/1*9_DVRSL82MMk6pTi0JDzeQ.png 750w, https://miro.medium.com/max/786/1*9_DVRSL82MMk6pTi0JDzeQ.png 786w, https://miro.medium.com/max/828/1*9_DVRSL82MMk6pTi0JDzeQ.png 828w, https://miro.medium.com/max/1100/1*9_DVRSL82MMk6pTi0JDzeQ.png 1100w, https://miro.medium.com/max/1120/1*9_DVRSL82MMk6pTi0JDzeQ.png 1120w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 560px" data-testid="og" /><img loading="lazy" decoding="async" class="ce kw kx c" role="presentation" src="https://miro.medium.com/max/1120/1*9_DVRSL82MMk6pTi0JDzeQ.png" alt="" width="560" height="96" /></picture></div>
</figure>
<p id="d864" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">If a user story was moved out of the release for any reason, mention it in the “Out of scope” section.</p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="gh gi kz"><picture><source srcset="https://miro.medium.com/max/640/1*9_DVRSL82MMk6pTi0JDzeQ.png 640w, https://miro.medium.com/max/720/1*9_DVRSL82MMk6pTi0JDzeQ.png 720w, https://miro.medium.com/max/750/1*9_DVRSL82MMk6pTi0JDzeQ.png 750w, https://miro.medium.com/max/786/1*9_DVRSL82MMk6pTi0JDzeQ.png 786w, https://miro.medium.com/max/828/1*9_DVRSL82MMk6pTi0JDzeQ.png 828w, https://miro.medium.com/max/1100/1*9_DVRSL82MMk6pTi0JDzeQ.png 1100w, https://miro.medium.com/max/1120/1*9_DVRSL82MMk6pTi0JDzeQ.png 1120w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 560px" data-testid="og" /><img loading="lazy" decoding="async" class="ce kw kx c" role="presentation" src="https://miro.medium.com/max/1120/1*9_DVRSL82MMk6pTi0JDzeQ.png" alt="" width="560" height="96" /></picture></div>
</figure>
<p id="19ab" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Every testing phase has a set of defined activities. The Timelines section is important because it records the start and end date of every activity, which makes it possible to monitor any deviation.</p>
<figure class="kr ks kt ku gt kv gh gi paragraph-image">
<div class="lb lc do ld ce le" tabindex="0" role="button">
<div class="gh gi la"><picture><source srcset="https://miro.medium.com/max/640/1*16oUsGj2yN8wxG1U0GrpqA.png 640w, https://miro.medium.com/max/720/1*16oUsGj2yN8wxG1U0GrpqA.png 720w, https://miro.medium.com/max/750/1*16oUsGj2yN8wxG1U0GrpqA.png 750w, https://miro.medium.com/max/786/1*16oUsGj2yN8wxG1U0GrpqA.png 786w, https://miro.medium.com/max/828/1*16oUsGj2yN8wxG1U0GrpqA.png 828w, https://miro.medium.com/max/1100/1*16oUsGj2yN8wxG1U0GrpqA.png 1100w, https://miro.medium.com/max/1400/1*16oUsGj2yN8wxG1U0GrpqA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce kw kx c" role="presentation" src="https://miro.medium.com/max/1400/1*16oUsGj2yN8wxG1U0GrpqA.png" alt="" width="700" height="138" /></picture></div>
</div>
</figure>
<p id="601d" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">If the actual start date deviates from the estimated start date, a risk must be raised accordingly.</p>
<p id="5749" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Be proactive in identifying any risks anticipated for testing the release. Types of risk include:</p>
<p id="ac05" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· a resource crunch,</p>
<p id="0175" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· test data unavailability,</p>
<p id="f0a7" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· environment instability,</p>
<p id="dcb0" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· slippage of code delivery date, etc.</p>
<p id="4d45" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Every risk mentioned should come with a mitigation, which is the backup plan for handling the risk. For example, in a resource crunch, either hire a new resource, borrow one from another team in the same project account, or, in the worst case, ask whether the existing resources can extend their shifts or log in during weekends (which is a bad idea).</p>
<p id="911c" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">If test data is not available in the lower environment, consider refreshing the lower environment’s database with data from production.</p>
<p id="7cab" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Work with the infra team to make the environment stable.</p>
<p id="f62e" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">Keep priority 1 and priority 2 suites of test cases ready, so that if the code delivery date slips, at least the priority 1 test cases can be completed in the time available.</p>
<p id="f55d" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The Entry Criteria section must be strictly followed as the acceptance criteria for starting testing.</p>
<p id="5cbc" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· Stable code without severity 1 and 2 defects,</p>
<p id="3105" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· Demo of the functionality and code walkthrough by developer,</p>
<p id="cca6" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· Presence of integration and unit tests at code level, to name a few.</p>
<p id="7093" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The Exit Criteria are essential for judging whether the code is of good quality and can be promoted to the higher levels. For example:</p>
<p id="0158" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· successful execution of all designed test cases,</p>
<p id="8247" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· absence of any severity 1 and 2 defects,</p>
<p id="6ad5" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· severity 3 bugs if present must have a workaround,</p>
<p id="2338" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· defects that do not hamper the existing code and can be fixed in the next or a subsequent release must be deferred after review with stakeholders,</p>
<p id="d6be" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">· demo of a happy path to end user.</p>
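<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">The exit criteria listed above can also be expressed as a small automated check. The following is only a minimal Python sketch under stated assumptions: the <code>Defect</code> class, the <code>unmet_exit_criteria</code> function and the example numbers are hypothetical names and values chosen for illustration, not part of any real test management tool.</p>
<pre>
```python
from dataclasses import dataclass


@dataclass
class Defect:
    severity: int                 # 1 is the most severe
    has_workaround: bool = False
    deferred: bool = False        # deferred after review with stakeholders


def unmet_exit_criteria(total_cases, passed_cases, open_defects, demo_done):
    """Return the exit criteria that are not yet met (empty list = ready)."""
    problems = []
    if passed_cases != total_cases:
        problems.append("not all designed test cases passed")
    for defect in open_defects:
        if defect.severity in (1, 2):
            problems.append("open severity 1 or 2 defect")
        elif defect.deferred:
            continue              # agreed to move to a later release
        elif defect.severity == 3 and not defect.has_workaround:
            problems.append("severity 3 defect without a workaround")
    if not demo_done:
        problems.append("happy-path demo to the end user is pending")
    return problems


# Hypothetical release: every case passed, one severity 3 defect with a workaround.
problems = unmet_exit_criteria(
    total_cases=40,
    passed_cases=40,
    open_defects=[Defect(severity=3, has_workaround=True)],
    demo_done=True,
)
```
</pre>
<p class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph="">An empty list means the release can be promoted. Returning the unmet criteria, rather than a plain yes or no, makes it easier to report exactly what is blocking the exit.</p>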
<p id="2bee" class="pw-post-body-paragraph jr js iu jt b ju jv jw jx jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn ko in fw" data-selectable-paragraph=""><em class="kp">Now that the difference between the two documents is clear, it is up to the respective teams to decide how to incorporate this documentation process, based on their time and requirement constraints.</em></p>
<p data-selectable-paragraph="">Originally published on <a href="https://medium.com/falabellatechnology/test-strategy-and-test-plan-ac39befaf16f">Medium</a></p><p>The post <a href="https://falabellaindia.com/portfolio/test-strategy-and-test-plan/">Test strategy and Test plan</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Why Spark ruling Data Space !!</title>
		<link>https://falabellaindia.com/portfolio/digital-campaigns/</link>
		
		<dc:creator><![CDATA[admin]]></dc:creator>
		<pubDate>Wed, 08 Dec 2021 04:38:19 +0000</pubDate>
				<guid isPermaLink="false">http://layerdrops.com/miboozwp/portfolio/marketing-advice-copy-copy-copy-copy/</guid>

					<description><![CDATA[<p>A question that often comes to mind is: “What is Spark, why is the industry chasing it, and why is it ruling the data space?” So hang tight, and let me start with a fun example. Think of a beehive. You have a single queen and hundreds or thousands of worker bees. They have very distinct roles. The queen is largely responsible [&#8230;]</p>
<p>The post <a href="https://falabellaindia.com/portfolio/digital-campaigns/">Why Spark ruling Data Space !!</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></description>
										<content:encoded><![CDATA[
<figure class="gl gn js jt ju jv gh gi paragraph-image">
<div class="jw jx do jy ce jz" tabindex="0" role="button">
<div class="gh gi jr"><picture><source srcset="https://miro.medium.com/max/640/1*BqD1Rrq3a6w_d_-hYDMsQQ.jpeg 640w, https://miro.medium.com/max/720/1*BqD1Rrq3a6w_d_-hYDMsQQ.jpeg 720w, https://miro.medium.com/max/750/1*BqD1Rrq3a6w_d_-hYDMsQQ.jpeg 750w, https://miro.medium.com/max/786/1*BqD1Rrq3a6w_d_-hYDMsQQ.jpeg 786w, https://miro.medium.com/max/828/1*BqD1Rrq3a6w_d_-hYDMsQQ.jpeg 828w, https://miro.medium.com/max/1100/1*BqD1Rrq3a6w_d_-hYDMsQQ.jpeg 1100w, https://miro.medium.com/max/1400/1*BqD1Rrq3a6w_d_-hYDMsQQ.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce ka kb c" src="https://miro.medium.com/max/1400/1*BqD1Rrq3a6w_d_-hYDMsQQ.jpeg" alt="Walk with Spark" width="700" height="468" /></picture></div>
</div>
</figure>
<p id="9ebe" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">A question that often comes to mind is: “What is Spark, why is the industry chasing it, and why is it ruling the data space?” So hang tight, and let me start with a fun example. <strong class="ke iv">Think of a beehive.</strong> You have a <strong class="ke iv"><em class="la">single</em></strong> <strong class="ke iv"><em class="la">queen</em></strong> and hundreds or <strong class="ke iv"><em class="la">thousands</em></strong> of <strong class="ke iv"><em class="la">worker</em></strong> <strong class="ke iv"><em class="la">bees</em></strong>. They have very <strong class="ke iv"><em class="la">distinct</em></strong> roles. The queen is largely responsible for the ‘<strong class="ke iv"><em class="la">brains</em></strong>’ of the entire operation, conducting the orchestra of worker bees who are fulfilling the <strong class="ke iv"><em class="la">hundreds</em></strong> of <strong class="ke iv"><em class="la">tasks</em></strong> that need to be accomplished. The worker bees are the <strong class="ke iv"><em class="la">executors</em></strong>, putting in the work required to accomplish those tasks.</p>
<p id="0afe" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph=""><strong class="ke iv"><em class="la">What do Bees have to do with Apache Spark?</em></strong><br />
The beehive example may have confused you more, but be patient: the bees will come in handy soon. <strong class="ke iv"><em class="la">The formal definition of Apache Spark is that it is a general-purpose distributed data processing engine</em></strong>. It is also known as a cluster <strong class="ke iv"><em class="la">computing framework</em></strong> for large-scale data processing.</p>
<p id="ead3" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">Spark is built to handle extremely large-scale data. The sheer amount of data loaded into a Spark application is enough to overwhelm almost any single computer. To handle that, Spark uses multiple computers (together called a cluster) that process the tasks required for a job and work together to produce the desired output.</p>
<p id="0fe5" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">This is where the bee analogy comes in. <strong class="ke iv"><em class="la">Let’s start with a diagram.</em></strong></p>
<figure class="lc ld le lf gt jv gh gi paragraph-image">
<div class="jw jx do jy ce jz" tabindex="0" role="button">
<div class="gh gi lb"><picture><source srcset="https://miro.medium.com/max/640/1*VclU-cSmPCQxq6uMLDwxPA.png 640w, https://miro.medium.com/max/720/1*VclU-cSmPCQxq6uMLDwxPA.png 720w, https://miro.medium.com/max/750/1*VclU-cSmPCQxq6uMLDwxPA.png 750w, https://miro.medium.com/max/786/1*VclU-cSmPCQxq6uMLDwxPA.png 786w, https://miro.medium.com/max/828/1*VclU-cSmPCQxq6uMLDwxPA.png 828w, https://miro.medium.com/max/1100/1*VclU-cSmPCQxq6uMLDwxPA.png 1100w, https://miro.medium.com/max/1400/1*VclU-cSmPCQxq6uMLDwxPA.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce ka kb c" role="presentation" src="https://miro.medium.com/max/1400/1*VclU-cSmPCQxq6uMLDwxPA.png" alt="" width="700" height="394" /></picture></div>
</div><figcaption class="lg bl gj gh gi lh li bm b bn bo cn" data-selectable-paragraph="">ref : Nick Rafferty</figcaption></figure>
<p id="8979" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">You may be laughing at this comparison, but let me explain the diagram:</p>
<p id="aaf6" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">Spark has two main components, the Spark driver and the Spark executors, which correspond to the beehive’s queen and worker bees.</p>
<p id="bb86" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph=""><strong class="ke iv">Spark Driver:</strong> The Queen Bee of the operation. The Spark Driver is responsible for generating the Spark Context. The Spark Context is extremely important since it is the entryway into all of Spark’s functionality. Using the Cluster Resource Manager (typically YARN, Mesos, or Standalone), the Driver obtains resources and divides the work among the Spark Executors, which run on the worker nodes. The Spark Driver is where the main method runs, meaning that any written program first interacts with the driver before being sent to the worker nodes in the form of tasks.</p>
<p id="1df3" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph=""><strong class="ke iv">Spark Executors: </strong>The worker bees. The executors are responsible for completing the tasks assigned to them by the driver with the help of the Cluster Resource Manager. As they perform those tasks, they can keep results in memory, referred to as a cache. If one of these nodes crashes, the tasks assigned to its executor are reassigned to another node. Every node can have up to one executor per core. Results are returned to the Spark Driver upon completion.</p>
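<p class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">The queen-and-workers split can be imitated on a single machine. The sketch below is plain Python, not Spark’s API: a thread pool stands in for the executors, and the function names (<code>split_into_partitions</code>, <code>run_task</code>, <code>driver</code>) are hypothetical names chosen for illustration.</p>
<pre>
```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, num_partitions):
    """Driver side: divide the input into roughly equal chunks."""
    return [data[i::num_partitions] for i in range(num_partitions)]


def run_task(partition):
    """Executor side: process one partition and return a partial result."""
    return sum(x * x for x in partition)


def driver(data, num_executors=4):
    partitions = split_into_partitions(data, num_executors)
    # The pool stands in for the cluster of executors (the worker bees).
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        partial_results = list(pool.map(run_task, partitions))
    # Partial results come back to the driver, which combines them.
    return sum(partial_results)


# Sum of the squares of 1..10, computed one partition per task.
total = driver(list(range(1, 11)))
```
</pre>
<p class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">In a real cluster the executors are separate processes on other machines and a cluster manager assigns them, but the shape is the same: the driver splits the work, the executors run the tasks, and the results return to the driver.</p>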
<p id="77ea" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">Apache Spark also takes advantage of in-memory caching to run fast analytic queries against data of any size. An in-memory cache is designed to store data in RAM rather than on disk. You can use languages such as Scala, Java, Python, R, and SQL to work with Apache Spark.</p>
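<p class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">The idea behind an in-memory cache can be shown in a few lines of plain Python. This is only an illustration of the concept, not how Spark implements caching; <code>CachedDataset</code> and <code>slow_read</code> are hypothetical names, and the list of numbers stands in for data read from disk.</p>
<pre>
```python
class CachedDataset:
    """Loads data once and keeps it in RAM for later queries."""

    def __init__(self, loader):
        self._loader = loader   # function that reads from slow storage
        self._cache = None      # in-memory copy, filled on first use
        self.load_count = 0     # how many times storage was actually read

    def rows(self):
        if self._cache is None:
            self._cache = list(self._loader())
            self.load_count += 1
        return self._cache


def slow_read():
    # Hypothetical stand-in for reading a large file from disk.
    return [3, 1, 4, 1, 5, 9, 2, 6]


dataset = CachedDataset(slow_read)
largest = max(dataset.rows())    # first query triggers the load
smallest = min(dataset.rows())   # second query is served from memory
```
</pre>
<p class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">After both queries, <code>load_count</code> is still 1: the second query never touched the slow storage, which is where the speed-up for repeated analytic queries comes from.</p>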
<p id="726e" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph=""><strong class="ke iv">Note: Spark is not a programming language.</strong> It is a general-purpose distributed data processing engine. Spark is written in Scala, a functional programming language. Fortunately, Spark has a great Python integration called PySpark (which is what I mostly use), and it lets me interface with the framework any way I want.</p>
<h1 id="4523" class="lj lk iu bm ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg fw" data-selectable-paragraph=""><strong class="ba">How does Apache Spark work?</strong></h1>
<p id="42a7" class="pw-post-body-paragraph kc kd iu ke b kf mh kh ki kj mi kl km kn mj kp kq kr mk kt ku kv ml kx ky kz in fw" data-selectable-paragraph="">By now you are probably curious about how it works, so let’s dive deeper into the Spark architecture:</p>
<figure class="lc ld le lf gt jv gh gi paragraph-image">
<div class="jw jx do jy ce jz" tabindex="0" role="button">
<div class="gh gi mm"><picture><source srcset="https://miro.medium.com/max/640/1*oN97ICf-APGMaBK7L_A5oA.jpeg 640w, https://miro.medium.com/max/720/1*oN97ICf-APGMaBK7L_A5oA.jpeg 720w, https://miro.medium.com/max/750/1*oN97ICf-APGMaBK7L_A5oA.jpeg 750w, https://miro.medium.com/max/786/1*oN97ICf-APGMaBK7L_A5oA.jpeg 786w, https://miro.medium.com/max/828/1*oN97ICf-APGMaBK7L_A5oA.jpeg 828w, https://miro.medium.com/max/1100/1*oN97ICf-APGMaBK7L_A5oA.jpeg 1100w, https://miro.medium.com/max/1400/1*oN97ICf-APGMaBK7L_A5oA.jpeg 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce ka kb c" role="presentation" src="https://miro.medium.com/max/1400/1*oN97ICf-APGMaBK7L_A5oA.jpeg" alt="" width="700" height="357" /></picture></div>
</div>
</figure>
<p id="bf9c" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">The first block you see is the driver program. Once you run spark-submit, a driver program is launched. It requests resources from the cluster manager and, at the same time, starts the main function of the user’s processing program.</p>
<p id="9e9b" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">Based on that, the execution logic is processed and, in parallel, the Spark context is created. The transformations and actions are processed through the Spark context. Until an action is encountered, the transformations are only recorded in the Spark context in the form of a DAG (directed acyclic graph), which makes up the RDD lineage.</p>
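<p class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">This “record now, run later” behaviour is easier to see in code. The following is a toy model in plain Python, not Spark itself: <code>LazyDataset</code> is a hypothetical class whose <code>map</code> and <code>filter</code> only append to a lineage, while <code>collect</code> plays the role of an action.</p>
<pre>
```python
class LazyDataset:
    """Records transformations; runs them only when an action is called."""

    def __init__(self, source, lineage=()):
        self._source = source
        self.lineage = tuple(lineage)   # ordered (name, function) steps

    # Transformations: return a new dataset and compute nothing.
    def map(self, fn):
        return LazyDataset(self._source, self.lineage + (("map", fn),))

    def filter(self, fn):
        return LazyDataset(self._source, self.lineage + (("filter", fn),))

    # Action: replay the recorded lineage over the source data.
    def collect(self):
        rows = list(self._source)
        for name, fn in self.lineage:
            if name == "map":
                rows = [fn(r) for r in rows]
            else:
                rows = [r for r in rows if fn(r)]
        return rows


calls = []


def double(x):
    calls.append(x)      # lets us observe when work actually happens
    return x * 2


plan = LazyDataset([1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
work_before_action = len(calls)   # nothing has run yet
result = plan.collect()           # the action triggers execution
```
</pre>
<p class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">Because the lineage is kept as data, the same recipe can be replayed if a result is lost, which is the intuition behind rebuilding a partition from its RDD lineage.</p>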
<p id="69a2" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">Once an action is called, a job is created. A job is a collection of stages, each made up of tasks. Once these tasks are created, they are launched on the worker nodes through the cluster manager, with the help of a class called the task scheduler.</p>
<p id="cb2f" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">The DAG scheduler converts the RDD lineage into tasks. The DAG is built from the transformations in the program, and once an action is called it is split into stages of tasks, which are submitted to the task scheduler as they become ready.</p>
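<p class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">As a rough illustration of how a lineage becomes stages and tasks, here is a simplified plain-Python sketch. It assumes that a new stage starts at every operation that needs a shuffle, which is a simplification of what Spark’s DAG scheduler really does; the function names and the small set of “wide” operations are illustrative assumptions.</p>
<pre>
```python
# Operations assumed to need a shuffle, so they start a new stage.
WIDE_OPERATIONS = {"groupByKey", "reduceByKey", "join"}


def split_into_stages(lineage):
    """Group an ordered list of operation names into stages."""
    stages = [[]]
    for op in lineage:
        if op in WIDE_OPERATIONS and stages[-1]:
            stages.append([])     # shuffle boundary: begin a new stage
        stages[-1].append(op)
    return stages


def tasks_for(stages, num_partitions):
    """One task per partition per stage, ready for a task scheduler."""
    return [
        (stage_id, partition)
        for stage_id in range(len(stages))
        for partition in range(num_partitions)
    ]


lineage = ["map", "filter", "reduceByKey", "map"]
stages = split_into_stages(lineage)
tasks = tasks_for(stages, num_partitions=2)
```
</pre>
<p class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">Here the four operations collapse into two stages, and with two partitions that yields four tasks. Operations inside one stage can run back to back on the same partition without moving data between nodes.</p>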
<p id="3cbc" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">These tasks are then launched on the different executors in the worker nodes through the cluster manager. The cluster manager handles the entire resource allocation and the tracking of the jobs and tasks.</p>
<p id="39e2" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">As soon as you run spark-submit, your user program and the other configuration you specified are copied to all the available nodes in the cluster, so that the program can be read locally on every worker node. Hence, the parallel executors running on the different worker nodes do not need any network routing to fetch it. This is how the entire execution of a Spark job happens.</p>
<p id="b595" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph=""><strong class="ke iv">Differences between traditional Hadoop MapReduce and Spark: </strong>If you have traditionally worked with MapReduce on Hadoop, you might wonder: why Spark? Spark has a big margin of advantage, from speed to ease of writing. Some of the stark differences are:</p>
<figure class="lc ld le lf gt jv gh gi paragraph-image">
<div class="jw jx do jy ce jz" tabindex="0" role="button">
<div class="gh gi mn"><picture><source srcset="https://miro.medium.com/max/640/1*UVBjqnrf82xBVoXO6qfQdQ.png 640w, https://miro.medium.com/max/720/1*UVBjqnrf82xBVoXO6qfQdQ.png 720w, https://miro.medium.com/max/750/1*UVBjqnrf82xBVoXO6qfQdQ.png 750w, https://miro.medium.com/max/786/1*UVBjqnrf82xBVoXO6qfQdQ.png 786w, https://miro.medium.com/max/828/1*UVBjqnrf82xBVoXO6qfQdQ.png 828w, https://miro.medium.com/max/1100/1*UVBjqnrf82xBVoXO6qfQdQ.png 1100w, https://miro.medium.com/max/1400/1*UVBjqnrf82xBVoXO6qfQdQ.png 1400w" sizes="(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px" data-testid="og" /><img loading="lazy" decoding="async" class="ce ka kb c" role="presentation" src="https://miro.medium.com/max/1400/1*UVBjqnrf82xBVoXO6qfQdQ.png" alt="" width="700" height="305" /></picture></div>
</div>
</figure>
<h1 id="2912" class="lj lk iu bm ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mf mg fw" data-selectable-paragraph="">Why is it ruling?</h1>
<p id="74d3" class="pw-post-body-paragraph kc kd iu ke b kf mh kh ki kj mi kl km kn mj kp kq kr mk kt ku kv ml kx ky kz in fw" data-selectable-paragraph="">There are many reasons why it rules, but I would like to list a few of them. First, what makes Spark so special? The answer is threefold: 1. SPEED, 2. EASE OF USE, 3. A UNIFIED ENGINE.</p>
<p id="8831" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">But let’s dive deeper into a few rich features.</p>
<ol class="">
<li id="ee36" class="mo mp iu ke b kf kg kj kk kn mq kr mr kv ms kz mt mu mv mw fw" data-selectable-paragraph=""><strong class="ke iv"><em class="la">Fast processing:</em> </strong>The most important feature of Apache Spark, and the one that has made the big data world choose this technology over others, is its speed. Big data is characterized by volume, variety, velocity, and veracity, and it needs to be processed at high speed. Spark’s Resilient Distributed Dataset (RDD) saves time on read and write operations, allowing it to run roughly ten to one hundred times faster than Hadoop MapReduce on some workloads.</li>
<li id="c04f" class="mo mp iu ke b kf mx kj my kn mz kr na kv nb kz mt mu mv mw fw" data-selectable-paragraph=""><strong class="ke iv"><em class="la">Flexibility</em></strong>: Apache Spark supports multiple languages and lets developers write applications in Java, Scala, R, or Python.</li>
<li id="126e" class="mo mp iu ke b kf mx kj my kn mz kr na kv nb kz mt mu mv mw fw" data-selectable-paragraph=""><strong class="ke iv"><em class="la">In-memory computing</em></strong>: Spark stores data in the RAM of the servers, which allows quick access and in turn speeds up analytics.</li>
<li id="252c" class="mo mp iu ke b kf mx kj my kn mz kr na kv nb kz mt mu mv mw fw" data-selectable-paragraph=""><strong class="ke iv"><em class="la">Real-time processing: </em></strong>Unlike MapReduce, which processes only stored data, Spark can process real-time streaming data and can therefore produce near-instant results.</li>
<li id="fb1c" class="mo mp iu ke b kf mx kj my kn mz kr na kv nb kz mt mu mv mw fw" data-selectable-paragraph=""><strong class="ke iv"><em class="la">Better analytics: </em></strong>In contrast to MapReduce, which offers only Map and Reduce functions, Spark includes much more. Apache Spark offers a rich set of SQL queries, machine learning algorithms, complex analytics, and so on. With all this functionality, analytics can be performed far more effectively with Spark.</li>
</ol>
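<p class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">For readers who have not worked with MapReduce, the Map and Reduce functions mentioned in the list above can be illustrated with the classic word count. This is a plain-Python sketch of the programming model only, with hypothetical function names; it runs on one machine and does not use Hadoop or Spark.</p>
<pre>
```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all values that share a key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values of each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}


lines = ["Spark is fast", "spark is simple"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
```
</pre>
<p class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">In classic MapReduce every job has to be forced into this map, shuffle, reduce shape, and longer pipelines are chains of such jobs. Spark keeps the same building blocks but lets you combine many operations in one program.</p>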
<p id="ac2f" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph=""><strong class="ke iv"><em class="la">Scope:</em></strong> Spark can be used in almost every area of the data space, by almost everyone who works in it. It is a one-stop solution for many big-data problems. Some of the common use cases for Spark are:<br />
1. Batch processing (ETL)<br />
2. Real-time (well, almost real-time) processing<br />
3. ML and Deep Learning</p>
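<p class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">The “almost real time” in the list above refers to micro-batching: incoming events are grouped into small batches, and each batch is processed like a tiny batch job. The sketch below shows the idea in plain Python. It batches by count for simplicity, whereas a streaming engine would usually batch by time interval, and the names <code>micro_batches</code> and <code>process_batch</code> are hypothetical.</p>
<pre>
```python
from itertools import islice


def micro_batches(events, batch_size):
    """Yield consecutive lists of up to batch_size events."""
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def process_batch(batch):
    # Hypothetical per-batch job: total the amounts in the batch.
    return sum(batch)


incoming = [5, 10, 20, 1, 2, 3, 7]   # stand-in for a live stream
totals = [process_batch(b) for b in micro_batches(incoming, 3)]
```
</pre>
<p class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">Each result is available as soon as its batch closes, so the delay is bounded by the batch size rather than by the length of the whole stream.</p>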
<p><strong class="ke iv">Data scientists</strong> use Apache Spark to perform advanced data analytics. Python brings an extensive set of advanced analytical functions that can be applied to data in Spark. Python is one of the more popular languages in the data science community and is supported by Spark via a toolset called <a class="au nc" href="https://spark.apache.org/docs/0.9.0/python-programming-guide.html" target="_blank" rel="noopener ugc nofollow">PySpark</a>.</p>
<p id="db15" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph=""><strong class="ke iv">Data engineers</strong> are data designers or builders. They usually assist data scientists and application developers in the data curation journey. They develop the architecture for the organization based on use cases and needs.</p>
<p id="9569" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph=""><strong class="ke iv">Application developers</strong> can build solutions using Apache Spark. These applications are generally for analytical and business intelligence purposes. Spark is great for data analysis style applications and not for transaction processing applications.</p>
<p id="b4b8" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph=""><strong class="ke iv">BOTTOM LINE:</strong> Apache Spark was built in 2009 and made resilient by 2012 because MapReduce was slow and complicated, and people wanted something faster and easier. Apache Spark is fast and far simpler to program than MapReduce. Imagine shoving a bunch of data into computer memory and being able to read it, process it, or do something with it rapidly. That is Apache Spark. <strong class="ke iv"><em class="la">In layman’s terms, MapReduce was slow! SLOW + BIG DATA = NO JOY, thus we get Spark.</em></strong></p>
<p id="1809" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph="">Apache Spark requires a decent amount of technical knowledge to get working. The average business person will need a lot of help to get up and running on Apache Spark (and that is being generous). It is for programmers, data scientists, and highly technical unicorns.</p>
<p id="56aa" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph=""><strong class="ke iv">CONCLUSION:</strong> Spark is an incredible technology and this article only touches the surface. I didn’t go into RDDs and other aspects but maybe in a future article I will. Hope to see you next time when I talk about installing and using Apache Spark in Kubernetes!</p>
<p id="983a" class="pw-post-body-paragraph kc kd iu ke b kf kg kh ki kj kk kl km kn ko kp kq kr ks kt ku kv kw kx ky kz in fw" data-selectable-paragraph=""><strong class="ke iv"><em class="la">Thanks for reading, cheers!</em></strong></p>
<p data-selectable-paragraph="">Originally published on <a href="https://medium.com/falabellatechnology/why-spark-ruling-data-space-4c3da8b0810e">Medium</a></p><p>The post <a href="https://falabellaindia.com/portfolio/digital-campaigns/">Why Spark ruling Data Space !!</a> first appeared on <a href="https://falabellaindia.com">Falabella</a>.</p>]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
