SmartScan: An AI-based Interactive Framework for Automated Region Extraction from Satellite Images

Savinay Nagendra*
Kashif Rashid, Schlumberger-Doll Research, Cambridge, MA 02139

* Work completed during an internship at Schlumberger-Doll Research, Cambridge, MA 02139, during May-August 2023.

Abstract

The deployment of a continuous monitoring system for methane emission sources on a client facility entails establishing the optimal number and location of fixed point sensors. The planning process, however, can be labor intensive, as it takes considerable effort to set up a site and run multiple iterations to fully capture client restrictions. This process is particularly time-consuming when many sites are to be evaluated, which considerably hinders scalability.

Motivated by this, we introduce SmartScan, an AI framework that automates the extraction of pertinent data-sets necessary for optimal sensor placement design. Thus, the subspaces of interest are extracted from a satellite image of the site with an interactive tool that helps to quickly create facility-specific and problem-dependent constraint sets.

SmartScan employs the Segment Anything Model (SAM), a prompt-based transformer model for zero-shot segmentation, to extract sub-spaces (regions of interest) from any satellite image without the need for explicit training. SmartScan has two modes of operation: (1) In the Data Curation Mode, a satellite image is processed to extract high-quality sub-spaces. For this, SmartScan has a unique interactive prompting design to rapidly gather user-prompts for SAM. The extracted sub-spaces are utilized in downstream algorithms. (2) In the Autonomous Mode, user-prompts gathered in the Data Curation Mode are used as ground-truth to train a novel deep learning network. The trained network is deployed to replace user-prompting. Here, the end-to-end subspace extraction process is completely autonomous. Subsequently, the interactive visualization and annotation tool is used (1) to quality check and correct errors of the AI framework (e.g., to remove false positives that would affect the accuracy of downstream algorithms), and (2) to generate additional facility-specific constraint sets as required. With its novel end-to-end design and AI-based prompting mechanism, SmartScan is streamlined to produce high-quality sub-space extraction with high throughput and minimal human supervision (quality check), thus increasing the scalability and efficiency of downstream algorithms. Notably, the design of SmartScan makes it suited for extracting regions of interest from any ultra-high-resolution satellite imagery, making it domain agnostic.

Index Terms:

SmartScan, AI, Methane leak detection, Source-inversion, Segment Anything Model, Transformer, Zero-shot segmentation, Deep learning.

I Introduction

The oil and gas industry is facing an increasing demand to monitor its assets for methane leaks as part of efforts to reduce greenhouse gas emissions [1]. Methane is a potent greenhouse pollutant, with a global warming potential 84 times greater than that of carbon dioxide [1]. Approximately 20% of annual anthropogenic emissions can be attributed to the oil industry [2, 3]. These emissions fall into two main categories: intentional venting or unintentional fugitive leaks. Intentional venting occurs as a result of operational activities where methane is knowingly released into the atmosphere (e.g., resulting from the use of pneumatic natural gas valves or direct venting) [4]. While such leaks are undesirable, they can be addressed with revised work practices and the use of equipment that eliminates emissions. In contrast, fugitive leaks result from malfunctioning equipment such as wellheads, separators, compressors, and pipelines [4]. Recent research indicates that a small number of these leaks are responsible for a significant portion of total emissions [5, 6, 7]. Hence, there is a pressing need to rapidly identify and repair sources of methane pollution, especially those identified as super-emitters [3].

Figure 1: Example of a continuous methane leak monitoring system. An optimal number of sensors have been deployed at the facility according to the generated optimal placement design. Further, targeted source-inversion is used to determine the subspace inside which a leak occurs.

Continuous real-time monitoring has been identified as the most desirable strategy for detecting sources of methane leak, and is an absolute necessity for climate emissions control [3]. Such a continuous real-time monitoring system [3] developed at Schlumberger-Doll Research (SDR) utilizes permanently installed low cost metal-oxide methane sensors at a facility that can continuously monitor and identify leaks, with source localization and quantification methods to aid expedient remediation. The deployment of such a continuous monitoring system, as shown in Figure 1, entails (1) Defining subspaces (the regions containing potential sources of methane leak), (2) Establishing the optimal number and location of the fixed point sensors, and (3) Targeted source-inversion to find the subspace inside which a leak occurs during active monitoring.

The entire workflow of a continuous real-time monitoring system is shown in Figure 2. The client provides GPS coordinates of the facility which is given to module A as an input. In this module, a high resolution satellite image is extracted from the GPS coordinates. Subspaces (regions of potential leak) are drawn manually. Further, additional constraints such as site bounds, site perimeter, linear constraints and exclusion zones are also marked that collectively define feasible and infeasible regions for sensor placement. Subspaces, together with these constraints make a facility-specific problem constraint set. These are converted from pixel coordinate system to Cartesian coordinate system (CRS) and given to module B in the form of JSON files. Module B is for optimal sensor placement. First, wind rose [8] data is extracted given client-provided GPS coordinates to generate stochastic wind realizations. Next, leak points are sampled from the provided subspaces. Each leak point represents a potential leak source within the subspace that becomes a data point in the optimization. Next, a sensor design is created (number of sensors and their locations). Each leak point is activated and tested whether it is detectable by the set of the sensors in the current design. This is repeated for all candidate leak points to give the mean percentage coverage under wind uncertainty. The design is modified, and the process continues until the mean percentage coverage is maximized. The optimized sensor placement design is returned to A which converts the design back to GPS. This final design is conveyed for field implementation of the fixed point sensors at the facility. When the deployed sensors are activated, they start continuously monitoring the facility for leaks. Module C hosts the targeted source-inversion algorithm that finds the subspace in which a leak occurs (details can be found in [3]).
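The coverage evaluation inside Module B can be illustrated with a minimal sketch. The detection rule below (a sensor detects a leak when it lies inside a downwind cone of the leak and within a maximum range) is a hypothetical stand-in chosen only for illustration; the actual detectability model is described in [3]. The function names, the cone half-angle, the range, and the convention that `wind_dir_deg` is the direction the wind blows toward (counter-clockwise from the +x axis) are all assumptions.

```python
import math


def is_detected(leak, sensor, wind_dir_deg, half_angle_deg=20.0, max_range_m=150.0):
    """Hypothetical rule: the sensor sees the leak if it sits inside a
    downwind cone of the leak and within max_range_m (Cartesian meters)."""
    dx, dy = sensor[0] - leak[0], sensor[1] - leak[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return True
    if dist > max_range_m:
        return False
    bearing = math.degrees(math.atan2(dy, dx))  # direction from leak to sensor
    # Smallest absolute angular difference between bearing and wind direction.
    diff = abs((bearing - wind_dir_deg + 180.0) % 360.0 - 180.0)
    return diff <= half_angle_deg


def mean_coverage(leaks, sensors, wind_dirs_deg):
    """Mean percentage of leak points seen by at least one sensor,
    averaged over the sampled wind realizations."""
    total = 0.0
    for wind in wind_dirs_deg:
        hits = sum(any(is_detected(l, s, wind) for s in sensors) for l in leaks)
        total += 100.0 * hits / len(leaks)
    return total / len(wind_dirs_deg)
```

An optimizer would call `mean_coverage` for each candidate design and keep the design with the highest value.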

Well-defined subspaces are pertinent to the performance of both the optimal sensor placement and source inversion algorithms (together referred to as downstream algorithms). However, the planning process that entails manually defining subspaces and facility-specific constraint sets is labor intensive, as it takes considerable effort to set up a site and run multiple iterations to fully capture client restrictions. This is particularly time-consuming when many facilities are to be evaluated and each is very different, as shown in Figure 3. This bottleneck significantly hinders scalability.

In this work, we introduce SmartScan, an AI-based framework [9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26] that automates the extraction of pertinent datasets necessary for optimal sensor placement design. Thus, subspaces of interest are extracted from a satellite image of the facility with an interactive tool that helps to quickly create facility-specific and problem-dependent constraint sets. SmartScan employs the Segment Anything Model (SAM) [27], a prompt-based transformer [28] model for zero-shot segmentation [29, 10, 12, 19], to extract subspaces from any satellite image without the need for explicit training. SmartScan has two modes of operation: (1) Data Curation Mode: A satellite image is processed to extract high-quality subspaces. For this, SmartScan has a unique interactive prompting design for rapidly providing user-prompts to SAM. Extracted subspaces are utilized further in downstream algorithms. (2) Autonomous Mode: User-prompts provided in the Data Curation Mode are used as ground-truth to train a novel deep learning network. The trained network is deployed to replace user-prompting. Here, the end-to-end subspace extraction process is completely autonomous. Subsequently, the interactive visualization and annotation tool is used for (1) a quality check to correct errors of the AI framework (if any) that could affect the accuracy of downstream algorithms, and (2) generating facility-specific constraint sets. With its novel end-to-end design and AI-based prompting mechanism, SmartScan is streamlined to produce high-quality subspace extraction with high throughput and minimal human supervision (quality check), thus increasing the scalability and efficiency of downstream algorithms. Further, the design of SmartScan makes it suited for extracting regions of interest from any ultra-high-resolution satellite imagery, making it domain agnostic.

The key contributions of this work are:

• The development of SmartScan, an interactive AI-based framework for automated, few-shot and domain-agnostic segmentation of satellite imagery.
• The Data Curation Mode of SmartScan is equipped with a novel, interactive prompting module to rapidly generate user-prompts for extracting high-quality segmentation maps from SAM.
• The Autonomous Mode of SmartScan is equipped with a novel deep learning based prompt generator module, trained on user-prompts generated in the Data Curation Mode, that can replace user-prompting, enabling end-to-end high-quality segmentation.
• The Quality Check module is equipped with an interactive annotation and visualization tool for generating high-quality subspaces.
• We demonstrate that the proof-of-concept novel autonomous prompting module paired with SAM is a powerful scheme for achieving high-quality few-shot segmentation with high generalizability. This scheme is light-weight, easily trainable with few data points, and memory-efficient, while still being able to exceed or have on-par performance with supervised segmentation models.

II SmartScan

In this section, we present the SmartScan workflow.

II-A Segment Anything Model (SAM)

The Segment Anything project [27] introduces a new task, model, and dataset for image segmentation. Using an efficient model in a data collection loop, Meta AI built the largest segmentation dataset to date, with over 1 billion masks on 11M licensed and privacy-respecting images. The pipeline is shown in Figure 4. The image encoder is a Vision Transformer (ViT) used for feature extraction. The model is designed and trained to be prompted, so it can transfer zero-shot to new image distributions and tasks. Meta AI evaluated its capabilities on numerous tasks and found that its zero-shot performance was impressive, often competitive with or even superior to prior fully supervised results. Meta AI released the Segment Anything Model (SAM) and the corresponding dataset (SA-1B) of 1B masks and 11M images (around April 2023) to foster research into foundation models for computer vision.

II-A1 Image Encoder

Motivated by scalability and powerful pre-training methods, SAM uses an MAE [30] pre-trained Vision Transformer (ViT) [31], minimally adapted to process high-resolution inputs [32]. The image encoder runs once per image and can be applied prior to prompting the model.

II-A2 Prompt Encoder

SAM uses two sets of prompts: sparse (given as points, boxes or text) and dense (as masks). Points and boxes are represented by positional encodings [33] summed with learned embeddings for each prompt type and free-form text with an off-the-shelf text encoder from CLIP [34]. Dense prompts (i.e., masks) are embedded using convolutions and summed element-wise with the image embedding. For this project, we are not considering text and dense (mask) prompts.

II-A3 Mask Decoder

The mask decoder efficiently maps the image embedding, prompt embeddings, and an output token to a mask. This design, inspired by [35, 36], employs a modification of a Transformer decoder block [37] followed by a dynamic mask prediction head. The modified decoder block uses prompt self-attention and cross-attention in two directions (prompt-to-image embedding and vice-versa) to update all embeddings. After running two blocks, the image embedding is up-sampled and an MLP maps the output token to a dynamic linear classifier, which then computes the mask foreground probability at each image location.
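The final step of the decoder, where the MLP output acts as a dynamic linear classifier over the up-sampled image embedding, can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name and array shapes are assumptions, the MLP and attention blocks are omitted, and `token_weights` stands for the vector that the MLP produces from the output token.

```python
import numpy as np


def dynamic_mask_head(upsampled_embedding, token_weights):
    """Apply a per-image linear classifier to every spatial location.

    upsampled_embedding: array of shape (C, H, W)
    token_weights:       array of shape (C,), assumed to be the MLP output
    Returns an (H, W) array of foreground probabilities.
    """
    c, h, w = upsampled_embedding.shape
    # Dot product between the classifier weights and each pixel's feature vector.
    logits = token_weights @ upsampled_embedding.reshape(c, h * w)
    probs = 1.0 / (1.0 + np.exp(-logits))  # sigmoid
    return probs.reshape(h, w)
```

Thresholding the returned probabilities at 0.5 gives a binary mask.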

II-B SmartScan: Back-End

In this section, we discuss the components of the SmartScan pipeline, shown in Figure 5, and their functionalities. We will be discussing the Autonomous Mode of SmartScan, which is the current mode of implementation.

II-B1 Satellite Image Extraction

The client provides the GPS [38] coordinates of the facility/site where the continuous real-time methane monitoring system must be set up. With this location as the center, bottom-left and top-right GPS coordinates are calculated at a particular zoom-level in Google Maps [39], which collectively provides the extent of the site in meters.
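One way to estimate the site extent from the center coordinate and zoom level is the standard Web Mercator ground-resolution formula. The sketch below is an approximation under stated assumptions: 256-pixel base tiles, a 3072-pixel image side (`size_px`), and a local flat-earth conversion from meters to degrees. The function names are hypothetical and this is not necessarily the computation used in SmartScan.

```python
import math

EARTH_CIRCUMFERENCE_M = 40075016.686  # equatorial circumference (WGS84)
TILE_SIZE_PX = 256                    # base tile size of Web Mercator maps
METERS_PER_DEG_LAT = 111320.0         # approximate, assumed constant


def meters_per_pixel(lat_deg, zoom):
    """Ground resolution of a Web Mercator map at a latitude and zoom level."""
    return (EARTH_CIRCUMFERENCE_M * math.cos(math.radians(lat_deg))
            / (TILE_SIZE_PX * 2 ** zoom))


def site_bounds(lat_deg, lon_deg, zoom, size_px=3072):
    """Return (bottom_left, top_right, extent_m) for a square image
    of size_px pixels centered on (lat_deg, lon_deg)."""
    half_m = 0.5 * size_px * meters_per_pixel(lat_deg, zoom)
    dlat = half_m / METERS_PER_DEG_LAT
    dlon = half_m / (METERS_PER_DEG_LAT * math.cos(math.radians(lat_deg)))
    bottom_left = (lat_deg - dlat, lon_deg - dlon)
    top_right = (lat_deg + dlat, lon_deg + dlon)
    return bottom_left, top_right, 2.0 * half_m
```

At the equator and zoom level 20, this gives a resolution of roughly 0.149 m per pixel, so a 3072-pixel image spans roughly 459 m.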

The site bounds are converted to pixel domain, where $36$ instances of $512\times 512\times 3$ RGB images are stitched together to make one ultra-high-resolution satellite image of size $3072\times 3072\times 3$ as shown in Figure 6. This image is used to extract regions (subspaces) of potential methane leak sources within the site, and subsequently, to mark additional data necessary for planning purposes (e.g., exclusion zones, linear restrictions, site bounds and the imposed perimeter).
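The tile arithmetic works out as a $6\times 6$ grid: $6\times 512=3072$ pixels per side. A minimal NumPy sketch of the stitching step follows, assuming the 36 tiles are supplied in row-major order (left to right, top to bottom); the function name and the ordering are assumptions.

```python
import numpy as np


def stitch_tiles(tiles, grid=6):
    """Stitch grid*grid RGB tiles (row-major order) into one image."""
    if len(tiles) != grid * grid:
        raise ValueError("expected %d tiles, got %d" % (grid * grid, len(tiles)))
    # Concatenate each row of tiles side by side, then stack the rows vertically.
    rows = [np.concatenate(tiles[r * grid:(r + 1) * grid], axis=1)
            for r in range(grid)]
    return np.concatenate(rows, axis=0)
```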

II-B2 Prompt Generation

In this section, we will first discuss how user prompts are collected in the Data Curation Mode of SmartScan. We will then discuss how this collected data from user-prompting is used to develop an autonomous prompting system, which will be used in the Autonomous Mode of SmartScan.

Manual Prompting System. After testing the different modes of SAM, as discussed in Section II-A, we note that both box and point prompts are required for high-quality segmentation outputs, where higher quality segmentation necessarily implies better performance of downstream algorithms.

We note that SAM is designed to take only one box prompt and multiple point prompts per image as inputs. However, given the large size ( $\gtrsim 9$ MB) of the satellite images, two key challenges are encountered: (1) Giving the entire image to SAM for processing is highly memory-inefficient and fails on GPUs with less than 16 GB of memory. (2) Giving a single box prompt for the entire image results in poor segmentation quality. To alleviate both challenges, we designed our box prompt system to be similar to CAPTCHA (typically used for web security), as shown in Figure 7. First, the satellite image is presented to the user overlaid with a grid of $256\times 256$ cells. The user clicks on the boxes that cover the foreground objects (site equipment) and saves the box prompts as a JSON file. Next, the marked box prompts of interest are loaded in another interface where the user marks one or more points in every box prompt. These are also saved as a JSON file. Each $256\times 256$ grid of the image that was marked by the user now becomes an input to SAM with its corresponding point prompts. This way, all the grids can be processed in parallel on a GPU with increased memory efficiency to obtain high segmentation quality.
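The bookkeeping behind this grid-based prompting can be sketched as follows. A $3072\times 3072$ image contains $12\times 12=144$ cells of $256\times 256$ pixels. The JSON layout, field names, and function names below are hypothetical, chosen only to illustrate how clicks map to per-cell box and point prompts.

```python
import json


def click_to_cell(x, y, cell=256):
    """Map a pixel click to the (row, col) index of the grid cell it hits."""
    return y // cell, x // cell


def cell_to_box(row, col, cell=256):
    """Return the [x0, y0, x1, y1] pixel box of a grid cell."""
    x0, y0 = col * cell, row * cell
    return [x0, y0, x0 + cell, y0 + cell]


def build_prompts(box_clicks, point_clicks, cell=256):
    """Group point prompts by selected cell, in cell-local coordinates."""
    cells = sorted({click_to_cell(x, y, cell) for x, y in box_clicks})
    prompts = []
    for row, col in cells:
        x0, y0, x1, y1 = cell_to_box(row, col, cell)
        local = [[px - x0, py - y0] for px, py in point_clicks
                 if x0 <= px < x1 and y0 <= py < y1]
        prompts.append({"row": row, "col": col,
                        "box": [x0, y0, x1, y1], "points": local})
    return prompts


def to_json(prompts):
    """Serialize prompts with a hypothetical schema."""
    return json.dumps({"cell_size": 256, "prompts": prompts})
```

Each entry then corresponds to one $256\times 256$ crop that can be passed to SAM together with its local point prompts.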

A qualitative comparison of segmentation outputs from SAM with different modes of prompting is shown in Figure 8. It can be observed that our manual box + point prompting system achieves the best segmentation quality. This prompting system is used in the Data Curation Mode of SmartScan.

Autonomous Prompting System. The prompt JSON files stored in the Data Curation Mode contain valuable information regarding the most spatially-important regions of the satellite image. This information can be leveraged to train a light-weight neural network model to automatically pick spatially important regions from satellite images, thereby, learning to prompt SAM, instead of providing manual prompts. This idea is used in the Autonomous Prompting System.

• Box Prompt Generator: The Box Prompt Generator is a simple convolutional binary classifier. It takes as input each $256\times 256$ grid of a satellite image and assigns a binary class to it: $0$ being not-of-interest and $1$ being of-interest. With this, all the grids containing regions of interest can be extracted for a given satellite image. Being a binary classifier, the model is light-weight and can be quickly trained to convergence. The model also generalizes well and can be trained with little data to reach high accuracy, making it few-shot.
• Point Prompt Generator: A point prompt is a pixel coordinate $(x,y)$ in an image. Learning a single point using a neural network makes the training very rigid, as the model is constrained to predict this single point accurately. A user would mark a point prompt at different pixel coordinates (but in a neighborhood) during multiple iterations of manual prompting. However, we empirically observe that the output of SAM is not very sensitive to the location of the point prompt as long as it is within the object of interest. With this motivation, we create a 2D Gaussian heat map of the satellite image as the target for training a neural network to learn point prompts. This heat map represents the distribution of all the point prompts, with the center of each Gaussian being the most desired point prompt. We choose the standard deviation of the 2D Gaussians such that segmentation outputs from SAM are not significantly affected when the point is chosen to be within 1 standard deviation of the center. Our Point Prompt Generator is shown in Figure 9. We design the neural network to be an Autoencoder, where the input is an image grid and the output is the corresponding 2D Gaussian heat map. Finally, we employ a customized peak-finding algorithm to find the peaks of all the predicted Gaussians to get the point prompts. Again, this model is light-weight and easily trainable to convergence. The model is also highly generalizable and few-shot.

Figure 9: Point Prompt Generator. The Point Prompt Generator is an Autoencoder that takes as input an image grid and outputs a 2D Gaussian heat map that represents the distribution of point prompts in the grid. Finally, a customized peak finding algorithm is used to find the peak of the predicted Gaussians to get the point prompts.
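The heat-map target and the peak extraction can be illustrated with a small NumPy sketch. The standard deviation, threshold, and window size are placeholder values, and the local-maximum search is a simple stand-in for the customized peak-finding algorithm, whose details are not given here.

```python
import numpy as np


def gaussian_heatmap(points, size=256, sigma=8.0):
    """Build a training target: one 2D Gaussian per point prompt (x, y)."""
    ys, xs = np.mgrid[0:size, 0:size]
    heat = np.zeros((size, size), dtype=np.float64)
    for px, py in points:
        g = np.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2.0 * sigma ** 2))
        heat = np.maximum(heat, g)  # keep each peak at value 1
    return heat


def find_peaks(heat, threshold=0.5, window=5):
    """Return (x, y) of local maxima above a threshold."""
    r = window // 2
    h, w = heat.shape
    padded = np.pad(heat, r, mode="constant", constant_values=-np.inf)
    local_max = np.full((h, w), -np.inf)
    # Sliding-window maximum computed from shifted copies of the padded map.
    for dy in range(window):
        for dx in range(window):
            local_max = np.maximum(local_max, padded[dy:dy + h, dx:dx + w])
    ys, xs = np.nonzero((heat == local_max) & (heat >= threshold))
    return [(int(x), int(y)) for x, y in zip(xs, ys)]
```

Applied to a predicted heat map, `find_peaks` returns candidate point prompts in the pixel coordinates of the grid.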

SAM, combined with our Autonomous Prompting System, can produce high-quality segmentation maps that are on par with or even better than task-specific semantic segmentation models trained from scratch. Comparative results showing the performance of our Autonomous Prompting System are shown in Figure 10. Two baseline prompt-predicting methods (center points or density-based) and one no-prompt method (everything mode) are used for comparison. It can be observed that our Autonomous Prompting System (column 4) gives the best results, both qualitatively and quantitatively, with $\gtrsim 90\%$ mean Intersection over Union, even on unseen cases.
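For reference, Intersection over Union for binary masks can be computed as below. This sketch assumes the mean is taken over images for the foreground class; the averaging convention behind the reported figure is not specified, so that choice and the function names are assumptions.

```python
import numpy as np


def iou(pred, target):
    """Intersection over Union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return float(np.logical_and(pred, target).sum() / union)


def mean_iou(preds, targets):
    """Average IoU over a set of (prediction, ground-truth) mask pairs."""
    return float(np.mean([iou(p, t) for p, t in zip(preds, targets)]))
```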

II-B3 Post-Processing for Subspace Extraction

After we get a binary segmentation map from SAM, the next step is to divide this map into subspaces, each made up of tightly-bound convex hulls. The post-processing involves the following steps:

• Apply Conditional Random Fields (CRF) [40] to the segmentation map to get rid of spurious islands of pixels.
• Extract contours of connected components using OpenCV.
• Filter any left-out spurious islands of pixels by area of extracted contour.
• Extract convex hulls from contours of connected components using Sklansky algorithm.
• Make convex hulls tight around each object by reducing dead-space (background pixels).
• Simplify the convex hulls using the Ramer–Douglas–Peucker [41] algorithm.
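The hull extraction and simplification steps above can be sketched in dependency-free Python. Two caveats: the hull routine uses Andrew's monotone chain algorithm as a stand-in for the Sklansky algorithm, and the CRF, contour extraction, area filtering, and tightening steps are not shown. In OpenCV, `cv2.convexHull` and `cv2.approxPolyDP` provide the corresponding operations. The function names are hypothetical.

```python
import math


def convex_hull(points):
    """Convex hull via Andrew's monotone chain (counter-clockwise order)."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def rdp(points, epsilon):
    """Ramer-Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    norm = math.hypot(x2 - x1, y2 - y1)

    def dist(p):
        # Perpendicular distance from p to the chord between the end points.
        if norm == 0.0:
            return math.hypot(p[0] - x1, p[1] - y1)
        return abs((x2 - x1) * (y1 - p[1]) - (x1 - p[0]) * (y2 - y1)) / norm

    idx, dmax = max(((i, dist(p)) for i, p in enumerate(points[1:-1], start=1)),
                    key=lambda t: t[1])
    if dmax > epsilon:
        # Keep the farthest point and simplify both halves recursively.
        return rdp(points[:idx + 1], epsilon)[:-1] + rdp(points[idx:], epsilon)
    return [points[0], points[-1]]
```

To simplify a closed hull with `rdp`, one option is to append the first vertex to the end of the list and drop the duplicate afterwards.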

II-B4 Quality Check

The Quality Check step is necessary to correct the mistakes (if any) made by the AI models, and to add facility-specific constraint sets as needed for the placement problem of interest. For this, we have designed an interactive visualization and annotation tool as shown in Figure 11 with the following functionalities:

• Create Polygon: Different shapes such as rectangles, circles, ellipses or straight lines can be easily drawn by mouse-drag. Any complex shape can be drawn.
• Delete Polygon: False-positive polygons can be deleted with a simple button-click, along with any undesired ones.
• Fragment Polygons: A convex polygon can be divided into four smaller convex polygons to get a tighter bound around regions of interest.
• Merge Polygons: Multiple smaller neighboring convex polygons can be merged into a single convex polygon.
• Edit Polygons: Vertices can be adjusted with mouse-drag to re-orient/reshape polygons, vertices can be removed to make the polygons simpler, and vertices can be added to form complex tighter shapes.
• Defining Elements: The site bounds, site perimeter, subspaces and exclusion zones are marked by polygons. A linear constraint can be defined by a triangle, where the first two points indicate the cut and the third point identifies the infeasible half-space. These elements are exported in associated JSON files (as site, subspaces, zones and linear constraints).
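The triangle encoding of a linear constraint reduces to a sign test: a candidate location is infeasible if it lies on the same side of the cut line as the third point. A minimal sketch follows; the function name is hypothetical, and treating points exactly on the cut line as feasible is an assumption.

```python
def is_infeasible(point, a, b, c):
    """Test a point against a linear constraint given as a triangle.

    a, b: the two points defining the cut line
    c:    a point inside the infeasible half-space
    """
    def side(p):
        # Sign of the 2D cross product (b - a) x (p - a).
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

    s_c = side(c)
    if s_c == 0:
        raise ValueError("third point must not lie on the cut line")
    return side(point) * s_c > 0
```

For example, with the cut along the x-axis from (0, 0) to (10, 0) and the third point at (5, 5), any location with a positive y coordinate is excluded from sensor placement.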

II-C SmartScan: Front-End

SmartScan is an interactive application for automated subspace extraction from satellite images. The app has been made available on both Linux and Windows operating systems. The master-screen of SmartScan is shown in Figure 12.

The app has an input module for entering the site name and the client-provided GPS coordinates. This module has three parameters: Latitude, Longitude and Zoom depth. An example of how to enter the coordinates is shown above the field-entry boxes. On clicking the Extract button, a back-end process is called to extract a high-resolution satellite image from Google Maps for the given latitude, longitude and zoom parameters. The zoom controls the spatial extent of the satellite image. If the satellite image extraction fails, a message is provided in the console indicating that the user should change the zoom level and try again. The module has been tested to work for zoom levels of 19 to 21. After the satellite image is extracted, a folder with the same name as the site name is created. This folder hosts all the metadata for that site. The drop-down lists in both the subspace extraction and visualization modules are populated with the site names. After the satellite image has been extracted, a user-prompt module opens to collect user-prompts (box and point). The user must provide these prompts and click the Save and Export button to save the user-prompts in the folder. The user then selects the site and clicks on Extract. This invokes the Segment Anything Model to extract the mask. The mask is post-processed to generate convex polygons. Next, the user can select from the list of cases in which the subspaces have been extracted and click on Open to begin the quality check using the annotation tool. Finally, after the quality check is completed, the user clicks the Save and Export button to export the JSON files to the designated site folder.

II-D Qualitative Results

Figure 13 shows qualitative results of high-quality subspace extraction using SmartScan from significantly visually dissimilar satellite images. This shows the combined effect of few-shot capabilities of our Autonomous Prompting System, and the domain-agnostic zero-shot efficiency of SAM.

III Conclusion

We presented SmartScan, an AI-based framework that automates the extraction of pertinent data-sets necessary for optimal sensor placement design. Thus, the subspaces of interest are extracted from a satellite image of the site with an interactive tool that helps to quickly create facility-specific and problem-dependent constraint sets (including bounds, perimeter, exclusion zones and other constraints).

SmartScan employs the Segment Anything Model (SAM), a prompt-based transformer model for zero-shot segmentation for extracting sub-spaces (regions of interest) from any satellite image without need for explicit training. SmartScan is streamlined for producing high-quality sub-space extraction with high throughput and minimal human supervision (quality check) with its novel end-to-end design and AI-based prompting mechanism, thus increasing scalability and efficiency of downstream algorithms. Further, the design of SmartScan makes it suited for extracting regions of interest from any ultra high-resolution satellite imagery, making it domain agnostic. SmartScan shows the utility of accurately prompting a pre-trained segmentation model such as SAM, thereby negating the need for a fully-supervised, task-specific segmentation model trained from scratch.

IV Acknowledgments

This work would not have been possible without the support of several colleagues at Schlumberger-Doll Research during my internship. I would like to thank the Sensing and Emissions team, and in particular, Andrew Speck, Junyi Yuan, Anna Tifft, Jafet Ruiz Santana and Antonio Vieira for their unwavering help. Lastly, I would like to extend my sincere thanks to Kashif Rashid, my mentor, for his valuable ideas and immense support for the entire duration of this project.

References

[1] W. J. Collins, C. P. Webber, P. M. Cox, C. Huntingford, J. Lowe, S. Sitch, S. E. Chadburn, E. Comyn-Platt, A. B. Harper, G. Hayman et al., “Increased importance of methane reduction for a 1.5 degree target,” Environmental Research Letters, vol. 13, no. 5, p. 054003, 2018.
[2] T. R. Scarpelli, D. J. Jacob, J. D. Maasakkers, M. P. Sulprizio, J.-X. Sheng, K. Rose, L. Romeo, J. R. Worden, and G. Janssens-Maenhout, “A global gridded inventory of methane emissions from oil, gas, and coal exploitation based on national reports to the united nations framework convention on climate change,” Earth System Science Data, vol. 12, no. 1, pp. 563–575, 2020.
[3] K. Rashid, L. Zielinski, J. Yuan, and A. Speck, “Subspace-constrained continuous methane leak monitoring and optimal sensor placement,” 2023.
[4] M. Soltanieh, A. Zohrabian, M. J. Gholipour, and E. Kalnay, “A review of global gas flaring and venting and impact on the environment: Case study of iran,” International Journal of Greenhouse Gas Control, vol. 49, pp. 488–509, 2016.
[5] D. H. Cusworth, R. M. Duren, A. K. Thorpe, W. Olson-Duvall, J. Heckler, J. W. Chapman, M. L. Eastwood, M. C. Helmlinger, R. O. Green, G. P. Asner et al., “Intermittency of large methane emitters in the permian basin,” Environmental Science & Technology Letters, vol. 8, no. 7, pp. 567–573, 2021.
[6] D. Zavala-Araiza, R. A. Alvarez, D. R. Lyon, D. T. Allen, A. J. Marchese, D. J. Zimmerle, and S. P. Hamburg, “Super-emitters in natural gas infrastructure are caused by abnormal process conditions,” Nature communications, vol. 8, no. 1, p. 14012, 2017.
[7] A. R. Brandt, G. A. Heath, and D. Cooley, “Methane leaks from natural gas systems follow extreme distributions,” Environmental science & technology, vol. 50, no. 22, pp. 12 512–12 520, 2016.
[8] L. Roubeyrie and S. Celles, “Windrose: A python matplotlib, numpy library to manage wind and pollution data, draw windrose,” Journal of Open Source Software, vol. 3, no. 29, p. 268, 2018.
[9] S. Nagendra, N. Podila, R. Ugarakhod, and K. George, “Comparison of reinforcement learning algorithms applied to the cart-pole problem,” in 2017 international conference on advances in computing, communications and informatics (ICACCI). IEEE, 2017, pp. 26–32.
[10] S. Nagendra, D. Kifer, B. Mirus, T. Pei, K. Lawson, S. B. Manjunatha, W. Li, H. Nguyen, T. Qiu, S. Tran et al., “Constructing a large-scale landslide database across heterogeneous environments using task-specific model updates,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 15, pp. 4349–4370, 2022.
[11] S. Nagendra, R. Baskaran, and S. Abirami, “Video-based face recognition and face-tracking using sparse representation based categorization,” Procedia Computer Science, vol. 54, pp. 746–755, 2015.
[12] C. Funk, S. Nagendra, J. Scott, B. Ravichandran, J. H. Challis, R. T. Collins, and Y. Liu, “Learning dynamics from kinematics: Estimating 2d foot pressure maps from video frames,” arXiv preprint arXiv:1811.12607, 2018.
[13] S. Nagendra and D. Kifer, “Patchrefinenet: Improving binary segmentation by incorporating signals from optimal patch-wise binarization,” in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024, pp. 1361–1372.
[14] J. Liu, C. Shen, T. Pei, K. Lawson, D. Kifer, S. Nagendra, and S. Banagere Manjunatha, “A new rainfall-induced deep learning strategy for landslide susceptibility prediction,” in AGU Fall Meeting Abstracts, vol. 2021, 2021, pp. NH35E–0504.
[15] T. Pei, S. Nagendra, S. Banagere Manjunatha, G. He, D. Kifer, T. Qiu, and C. Shen, “Utilizing an interactive ai-empowered web portal for landslide labeling for establishing a landslide database in washington state, usa,” in EGU General Assembly Conference Abstracts, 2021, pp. EGU21–13 974.
[16] T. Pei, S. Nagendra, S. B. Manjunatha, G. He, T. Qiu, D. Kifer, and C. Shen, “Cloud-based interactive database management suite integrated with deep learning-based annotation tool for landslide mapping,” in AGU Fall Meeting 2020. AGU, 2020.
[17] S. Nagendra, S. Banagere Manjunatha, C. Shen, D. Kifer, and T. Pei, “An efficient deep learning mechanism for cross-region generalization of landslide events,” in AGU Fall Meeting Abstracts, vol. 2020, 2020, pp. NH030–0010.
[18] L. Zhu, P. Tilke, S. Nagendra, M. Etchebes, and M. LeFranc, “A rapid and realistic 3d stratigraphic model generator conditioned on reference well log data,” in Second EAGE Digitalization Conference and Exhibition, vol. 2022, no. 1. European Association of Geoscientists & Engineers, 2022, pp. 1–5.
[19] S. Nagendra, C. Shen, and D. Kifer, “Threshnet: Segmentation refinement inspired by region-specific thresholding,” arXiv preprint arXiv:2211.06560, 2022.
[20] ——, “Estimating uncertainty in landslide segmentation models,” arXiv preprint arXiv:2311.11138, 2023.
[21] S. Nagendra, K. Rashid, C. Shen, and D. Kifer, “Samic: Segment anything with in-context spatial prompt engineering,” arXiv preprint arXiv:2412.11998, 2024.
[22] S. Nagendra and P. Panigrahi, “Emotion recognition from the perspective of activity recognition,” arXiv preprint arXiv:2403.16263, 2024.
[23] S. Nagendra, “Thermal analysis for nvidia gtx480 fermi gpu architecture,” arXiv preprint arXiv:2403.16239, 2024.
[24] ——, “Towards designing deep learning architectures for improving semantic segmentation performance,” 2025.
[25] S. N. D. Kifer, “Supplementary material-patchrefinenet: Improving binary segmentation by incorporating signals from optimal patch-wise binarization.”
[26] P. Cp, S. Narayanan, V. S. Kommuri, S. Subramanian, K. Bijlani, S. Nagendra, and N. Podila, “Icacci-02 (a): Artificial intelligence and machine learning/data engineering/biocomputing (regular papers).”
[27] A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W.-Y. Lo et al., “Segment anything,” arXiv preprint arXiv:2304.02643, 2023.
[28] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, and T. Unterthiner, “Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020.
[29] M. Bucher, T.-H. Vu, M. Cord, and P. Pérez, “Zero-shot semantic segmentation,” Advances in Neural Information Processing Systems, vol. 32, 2019.
[30] K. He, X. Chen, S. Xie, Y. Li, P. Dollár, and R. Girshick, “Masked autoencoders are scalable vision learners,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 16 000–16 009.
[31] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020.
[32] Y. Li, H. Mao, R. Girshick, and K. He, “Exploring plain vision transformer backbones for object detection,” in European Conference on Computer Vision. Springer, 2022, pp. 280–296.
[33] M. Tancik, P. Srinivasan, B. Mildenhall, S. Fridovich-Keil, N. Raghavan, U. Singhal, R. Ramamoorthi, J. Barron, and R. Ng, “Fourier features let networks learn high frequency functions in low dimensional domains,” Advances in Neural Information Processing Systems, vol. 33, pp. 7537–7547, 2020.
[34] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark et al., “Learning transferable visual models from natural language supervision,” in International conference on machine learning. PMLR, 2021, pp. 8748–8763.
[35] N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, “End-to-end object detection with transformers,” in European conference on computer vision. Springer, 2020, pp. 213–229.
[36] B. Cheng, A. Schwing, and A. Kirillov, “Per-pixel classification is not all you need for semantic segmentation,” Advances in Neural Information Processing Systems, vol. 34, pp. 17 864–17 875, 2021.
[37] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in neural information processing systems, vol. 30, 2017.
[38] A. El-Rabbany, Introduction to GPS: the global positioning system. Artech house, 2002.
[39] G. Maps, “Client Libraries for Google Maps Web Services | Google Maps Web Service APIs | Google for Developers,” https://developers.google.com/maps/web-services/client-library, [Accessed 27-08-2023].
[40] C. Sutton, A. McCallum et al., “An introduction to conditional random fields,” Foundations and Trends® in Machine Learning, vol. 4, no. 4, pp. 267–373, 2012.
[41] A. Saalfeld, “Topologically consistent line simplification with the douglas-peucker algorithm,” Cartography and Geographic Information Science, vol. 26, no. 1, pp. 7–18, 1999.