“Subrangshu worked for me the last two years. He demonstrated technical leadership, passion and desire to excel. His team delivered to commitments, drove change and championed a number of innovations. Subrangshu is able to form professional relationships, collaborates but is firm in keeping folks accountable. I have a high regard for him and strongly recommend him.”
About
Currently I am working as a Director, Silicon at Google.
Prior to this, as…
Activity
-
It was wonderful to connect with 100's of colleagues (past & present) at TI India's 40th anniversary. The whole sentiment and emotion was similar to…
Liked by Subrangshu Das
-
'I'm not going back': Billionaire Marc Benioff says he's switching to Google's Gemini 3 after using 'ChatGPT every day for three years' | Fortune…
Liked by Subrangshu Das
-
It was great to be on the panel “Fabless Innovation & Chip Design Leadership" at the Bengaluru Tech Summit today alongside distinguished leaders…
Liked by Subrangshu Das
Experience
Education
Licenses & Certifications
-
NVIDIA DLI Certificate – Fundamentals of Deep Learning
NVIDIA Deep Learning Institute
-
Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization
Coursera
Credential ID ZF2UYV27YNSM
Publications
-
A true multistandard, programmable, low-power, full HD video-codec engine for smartphone SoC
International Solid State Circuits Conference (ISSCC)
In this paper, we present IVA-HD, a true multistandard, programmable, full HD video coding engine which adopts optimal hardware-software partitioning to achieve the low-power and area requirements of the OMAP 4 processor. Unlike the approach of using separate IPs for encoder and decoder, IVA-HD uses an integrated codec engine which is area efficient, as most of the decoder logic is reused for the encoder. IVA-HD is architected to perform stream-rate and pixel-rate processing in a single pipeline (that processes one 16x16 macroblock at a time), so as to support the latency requirements of video conferencing.
-
PowerAdviser: An RTL power platform for interactive sequential optimizations
Proceedings of the Conference on Design, Automation and Test in Europe (DATE)
Power has become the overriding concern for most modern electronic applications. To reduce clock power, sequential clock gating is increasingly used over and above combinational clock gating. Given the complexity of manually identifying sequential clock gating changes, automatic tools are becoming popular. However, since these tools always work within the scope of the design and the constraints provided, they do not provide any insight into additional power savings that might still be possible. In this paper we present an interactive sequential analysis flow, PowerAdviser, which besides performing automatic sequential changes also provides information for additional power savings that the user can realize through manual changes. Using this new flow we have achieved dynamic power reduction up to 45% more than a purely automated flow.
-
Reducing Dynamic Power with Gate-level Clock-Gating Optimization
Design Automation Conference (DAC)
Reducing power consumption, improving battery life and ultimately reducing the carbon footprint of a device are becoming some of the most important care-abouts in digital design today. To reduce clock and flip-flop power (usually the most energy-consuming component in our designs today), gate-level clock-gating techniques, such as those found in the Azuro PowerCentric tool, are increasingly used over and above traditional RTL clock-gating. This technique is very effective, reducing clock and flip-flop power by a further 20-30% beyond RTL clock-gating. However, unlike RTL clock-gating, gate-level clock-gating introduces new logic paths in the design, which are redundant and added at the end of the datapath. A good scheme needs to be put in place to prevent this added logic from creating functional, testability, timing closure and gate-level simulation/equivalence verification issues. In this paper, we describe a robust implementation methodology that was used to successfully implement gate-level clock-gating using the Azuro PowerCentric tool on a 45nm low-power multi-media IP and reduce clock/flip-flop power by 22%.
-
RTL Power Optimization in Sequential Analysis Platforms
Design Automation Conference (DAC)
In this work we present an automated approach for RTL power optimization using sequential analysis. The approach analyzes pipelined datapaths in both forward and backward directions of clock cycles; based on which it derives conditions for which sequential stages can be clock gated. This approach becomes all the more critical when the source RTL is machine generated (also called Electronic System Level; viz. ESL) and manual analysis of RTL is not possible. This approach has been deployed for various datapath oriented designs in 45nm technology node. Using this technique, we have achieved power reduction in the order of 15% on top of low power synthesis solutions. We also report power optimization, obtained through an interactive mode, where opportunities were computed by sequential analysis, outside the aegis of automatic RTL modification. We also showcase an approach which combines solutions in the domain of Power Optimization, Estimation, and Logic Simulation to arrive at an integrated methodology for in-the-loop power optimization.
-
DFT Challenges in Next Generation Multi-media IP
Asian Test Symposium (ATS)
Multi-media based applications have increased immensely in the last few years. The need for better video quality, longer recording and playback time, more video channels and faster time to market (TTM) requires DFT solutions that use core-based testing to allow concurrent IP and SoC development, scale to support multiple technologies, and ease the development of timing constraints. This paper describes these challenges and the solutions used to address them.
-
The Automatic Generation of Merged-Mode Design Constraints
Design Automation Conference (DAC)
Multi-mode timing closure is a latent design issue that critically impacts the performance and schedule of our designs today. Even though P&R tools today support concurrent optimization of the design across multiple timing modes, our experience suggests that these solutions start to choke beyond 2-3 modes. One way to solve this issue has been to manually develop “merged-mode” constraints, which effectively capture the timing requirements across multiple operating modes of the design. But without a way to evaluate the completeness and accuracy of the “merged-mode” timing constraints, it often becomes necessary to fix the constraints late in the design flow, causing undesirable slips in design schedules. In order to circumvent this issue and generate constraints that are correct-by-construction, Company A and Company B have worked together to develop an automated technique using Company B’s tool (Product A) to generate “merged-mode” constraints. The input to this flow is a mode-table spreadsheet, which captures the complete list of operating modes supported by the design and the configuration settings required to put the design into the corresponding operating mode. This approach also helped reduce the cycle-time of developing high-quality merged-mode constraints by 2-3X compared with manually merged constraints.
Patents
-
Dynamic frame padding in a video hardware engine
Issued US 10547859
A video hardware engine that supports dynamic frame padding is disclosed. The video hardware engine includes an external memory. The external memory stores a reference frame. The reference frame includes a plurality of reference pixels. A motion estimation (ME) engine receives a current LCU (largest coding unit), and defines a search area around the current LCU for motion estimation. The ME engine receives a set of reference pixels corresponding to the current LCU. The set of reference pixels, drawn from the plurality of reference pixels, is received from the external memory. The ME engine pads a set of duplicate pixels along an edge of the reference frame when part of the search area is outside the reference frame.
-
Low power ultra-HD video hardware engine
Issued US 9973754
A low power video hardware engine is disclosed. The video hardware engine includes a video hardware accelerator unit. A shared memory is coupled to the video hardware accelerator unit, and a scrambler is coupled to the shared memory. A vDMA (video direct memory access) engine is coupled to the scrambler, and an external memory is coupled to the vDMA engine. The scrambler receives an LCU (largest coding unit) from the vDMA engine. The LCU comprises N×N pixels, and the scrambler scrambles N×N pixels in the LCU to generate a plurality of blocks with M×M pixels. N and M are integers and M is less than N.
-
System and method for managing cache
Issued US 9430393
A system includes first and second data processing components, a qualifier-based splitter component, first and second configurable cache elements, and an arbiter component. The first data processing component generates a first request for a first portion of data at a first location within a memory. The second data processing component generates a second request for a second portion of data at a second location within the memory. The qualifier-based splitter component routes the first request and the second request based on a qualifier. The first configurable cache element enables or disables prefetching data within a first region of the memory. The second configurable cache element enables or disables prefetching data within a second region of the memory. The arbiter component routes the first request and the second request to the memory.
-
Method to hide or reduce access latency of a slow peripheral in a pipelined direct memory access system
Issued US 7673091
A bus bridge between a high speed DMA bus and a lower speed peripheral bus sets a threshold for minimum available buffer space to send a read request dependent upon a frequency ratio and the DMA read latency. Similarly, a threshold for minimum available data for a write request depends on the frequency ratio and the DMA write latency. The bus bridge can store programmable values for the DMA read latency and write latency.
-
Software power control of circuit modules in a shared and distributed DMA system
Issued US 7321980
A system-on-chip integrated circuit selectively gates clocks to individual modules corresponding to the state of a corresponding bit of a peripheral enable register. A reset circuit supplies a signal to a reset input of the digital module for a normal mode if the bit indicates the power-up state and a reset mode if the bit indicates a power-down state. Return to normal mode is delayed a predetermined time after said bit indicates the power-up state to ensure clean power up. A false acknowledge circuit for each module supplies an acknowledge signal in response to a received command if the corresponding bit indicates the power-down state.
-
Software controlled hard reset of mastering IPs
Issued US 7315905
A system-on-chip integrated circuit includes a peripheral initialization register that has a bit corresponding to each module. Each bit indicates a normal mode or a reset mode for the corresponding module. A direct memory access unit can receive, prioritize and queue data movement transactions between modules and can read from or write to the peripheral initialization register. A peripheral interface unit prevents a write to the peripheral initialization register changing a module from reset mode to normal mode while there is an uncompleted data movement transaction involving that module. A false acknowledge circuit for each module supplies an acknowledge signal in response to a received command if the module is in reset mode.
-
Method for automating validation of integrated circuit test logic
Issued US 6553524
A methodology for automatic validation of integrated circuit (IC) test hardware that is performed during extraction of the test hardware. Signal connectivity between output test ports of one or more test control blocks and serially-connected scan latches of the test hardware is automatically validated, as is inter-connectivity between the serially-connected scan latches. Every instance to which a test signal and a test data signal at an output test port (both test signal and test data ports) of a test control block fan out is traversed until a scan latch is reached, in order to provide electrical and functional verification of the test hardware.
Recommendations received
15 people have recommended Subrangshu
More activity by Subrangshu
-
Team (TI SMPU DV team), hearty congratulations 🎊 Another Best Paper Award
Liked by Subrangshu Das
-
𝗛𝗮𝗽𝗽𝘆 𝗜𝗻𝗱𝗲𝗽𝗲𝗻𝗱𝗲𝗻𝗰𝗲 𝗗𝗮𝘆 𝘁𝗼 𝘂𝘀 — 𝟭𝟯 𝘆𝗲𝗮𝗿𝘀! 🔥 Exactly 13 years ago, on the Monday of Thanksgiving week, I was told I…
Liked by Subrangshu Das
-
Akshayakalpa Organic: has great products and customer service, but I especially love the fact that they take back all the milk packets for recycling.…
Liked by Subrangshu Das
-
Hello All, Wanted to provide a career update here 😊 After 29 incredible years at Intel, I made the difficult but thoughtful decision to retire…
Liked by Subrangshu Das
-
Was amazing to catch up with former colleagues at TI India’s 40th anniversary party. As Shraddha Jain (aiyyoShraddha) quipped on stage - you have to…
Liked by Subrangshu Das
-
TI will always remain close to my heart for innumerable stories like the one below.
Shared by Subrangshu Das
-
Data intelligence + scientific truth gives credibility to AI-FIRST manufacturing. Simulating advanced robotics physics to complex cleanroom chamber…
Liked by Subrangshu Das
-
𝑫𝒆𝒍𝒊𝒈𝒉𝒕𝒆𝒅 𝒕𝒐 𝒃𝒆 𝒔𝒑𝒆𝒂𝒌𝒊𝒏𝒈 𝒊𝒏 𝒕𝒉𝒆 𝑩𝒂𝒏𝒈𝒂𝒍𝒐𝒓𝒆 𝑳𝒊𝒕 𝑭𝒆𝒔𝒕 𝒐𝒏 𝒂 𝒕𝒉𝒆𝒎𝒆 𝒕𝒉𝒂𝒕 𝒊𝒔 𝒄𝒍𝒐𝒔𝒆 𝒕𝒐 𝒎𝒚…
Liked by Subrangshu Das