>. tahnikahmed@portfolio:~$

alex.dev@portfolio:~$ article

CUDA Kernel Optimization for Transformer Inference

Exploring kernel fusion, memory coalescing, and custom CUDA kernels to achieve a 3x speedup in transformer model inference on consumer GPUs.

Published in Category:

AI Systems

Reference Fields

To add Pagination, select your Collection List, click on Pagination, select one of the two options, then pick how many items to load. Pagination also works with existing Limits and Start Offsets. Both the Spinner and Button are completely customizable, and you can pick any Variant for their Loading states. The Spinner itself is just a layer with a conic gradient and a Loop Effect, so you get full control. Adding Pagination helps make your blogs and changelogs much faster to load, especially when they contain dozens of items.

Infinite Scrolling with custom Spinner component
Load More Button with custom Button component
Enjoy freeform positioning of both components
Design your own Loading and Hidden states
Make your CMS Pages much faster to load

Filtering

We've added the ability to filter your collection lists in the CMS. This allows you to keep your content in a single collection, yet customize how that collection is presented on each of your web pages. For example, if you're creating docs for your app, you might want to filter articles per topic on your homepage. Or when creating a blog, you might want to filter your blog posts per category.

MLOps

AI Engineering

AI Systems

Infrastructure

0 Minute Read

Why a Ray Cluster?

None of this is really a machine learning engineer's problem to solve, yet they end up either handling it themselves or asking platform engineers for help. Interestingly, this is how what we call "machine learning engineering" is gradually, and ultimately, turning into "software engineering". So how can we, as machine learning engineers, deal with this scenario? We can do it by building a Ray cluster. Ray is essentially a distributed computing framework that distributes machine learning workloads very efficiently across the nodes in a cluster.

Read the blog

MLOps

Infrastructure

15 Minute Read

How to Build Efficient and Cost-Effective Data Warehouses for SMBs?

Modernizing data warehouses with a hybrid Azure approach enables centralized storage, real-time analytics, and secure integration across tools like Synapse, Data Lake, Stream Analytics, and Power BI to deliver scalable insights and compliance-ready infrastructure.

Read the blog

MLOps

Infrastructure

12 Minute Read

Building Production ML Pipelines with Kubernetes

A deep dive into designing fault-tolerant, scalable ML training and serving pipelines on K8s — from resource scheduling to model versioning.

Read the blog