Mode-GS: Monocular Depth Guided Anchored 3D Gaussian Splatting

for Robust Ground-View Scene Rendering


Yonghan Lee1,2, Jaehoon Choi1, Dongki Jung1, Jaeseong Yun2,
Soohyun Ryu2, Dinesh Manocha1, and Suyong Yeon2

1 University of Maryland, 8125 Paint Branch Dr, College Park, MD 20742, USA.
2 NAVER LABS, 95 Jeongjail-ro, Bundang-gu, Seongnam-si, Gyeonggi-do, South Korea.

arXiv:2410.04646v1 [[Link]] 6 Oct 2024

Abstract— We present a novel-view rendering algorithm, Mode-GS, for ground-robot trajectory datasets. Our approach is based on using anchored Gaussian splats, which are designed to overcome the limitations of existing 3D Gaussian splatting algorithms. Prior neural rendering methods suffer from severe splat drift due to scene complexity and insufficient multi-view observation, and can fail to fix splats on the true geometry in ground-robot datasets. Our method integrates pixel-aligned anchors from monocular depths and generates Gaussian splats around these anchors using residual-form Gaussian decoders. To address the inherent scale ambiguity of monocular depth, we parameterize anchors with per-view depth-scales and employ a scale-consistent depth loss for online scale calibration. Our method results in improved rendering performance, based on PSNR, SSIM, and LPIPS metrics, in ground scenes with free trajectory patterns, and achieves state-of-the-art rendering performance on the R3LIVE odometry dataset and the Tanks and Temples dataset.

Fig. 1. Our Mode-GS integrates monocular depth estimation with anchored Gaussian splatting, using a scale-consistent depth calibration technique and residual-based Gaussian decoders. By incorporating dense pixel-aligned anchor points from monocular depth, anchored splatting improves robustness in scenarios without dense multi-view images and mitigates the impact of inaccurate poses in complex ground-view scenes. Our method can be developed using multi-sensor odometry poses in a point-cloud-free setting. Overall, it offers a practical and robust rendering pipeline for ground-view robotic datasets, as shown in Section V.

I. INTRODUCTION

The development of navigation and perception algorithms for autonomous robots typically requires extensive and costly field experiments for training and validation. In this context, neural rendering offers a practical solution, as it can significantly reduce the time and effort based on data simulation and augmentation [1], [2], [3]. Specifically, neural rendering can be used to generate novel view images by learning a neural scene representation from a set of input training images and corresponding poses [4], [5].

Current neural rendering research is based on two popular methods: implicit Neural Radiance Fields (NeRF) [4] and explicit 3D Gaussian Splatting (3DGS) [5], [6]. NeRF gained popularity because of its high-fidelity and continuous scene representations due to the implicit nature of radiance fields. However, it is hard to scale to large scenes because of the limited representational capacity of its coordinate-based Multi-Layer Perceptron (MLP), which cannot efficiently handle the cubic growth in scene complexity. On the other hand, 3DGS provides a feasible alternative for scene-scale rendering of ground-view robot datasets by explicitly representing only the non-empty parts of the scene, and using more interpretable Gaussian splats as scene primitives.

It turns out that complex ground-view robot datasets with free trajectories [7] present two fundamental challenges with respect to 3DGS algorithms: (1) the scarcity of multi-view information; (2) the difficulty of obtaining pixel-level accurate trajectory poses [8], [9], [10].

First of all, 3DGS requires dense point clouds for splat initialization and informative multi-view photometric gradients for Adaptive Density Control (ADC) [5] to expand splats into unoccupied areas, which significantly deteriorates its performance on ground-robot datasets. Since training a neural rendering algorithm heavily relies on dense multi-view observations due to its inherent high dimensionality, previous neural rendering approaches have largely focused on datasets with structured viewing patterns, such as aerial (top-down view) [11], [12], object-centric (inward view) [13], [14], and street (forward motion) datasets [15]. However, there is relatively less work on ground-robot datasets with free trajectories [16].

The second challenge in ground-view rendering is the sensitivity of 3DGS to the pixel-level pose accuracy of training images, which is crucial due to its reliance on pixel-based photometric loss. Acquiring pixel-accurate poses in ground-view datasets is difficult. Structure from Motion (SfM) [17] or vision-based SLAM methods [18] often struggle to consistently estimate poses in ground-robot datasets without fragmenting or diverging when images lack salient features or textures to track. While multi-sensor SLAM and odometry methods [19], [20], [16] are more reliable for trajectory pose estimation, they frequently fail to achieve pixel-level accuracy due to heterogeneous sensor configurations and sensor fusion. Due to the sensitivity of 3DGS to pose accuracy, SLAM algorithms that directly integrate 3D Gaussian splats [21], [22], [23] as the scene representation often fail to reliably estimate complete trajectories.

Fig. 2. We compare the degenerate training patterns of 3DGS in scenarios without dense multi-view information. The patterns are categorized according to their type: (a) Sequential Type (MonoGS): SLAM-based Gaussian splatting utilizes sequential information by processing consecutive images with pose refinement, initially generating sharper images. However, their pose tends to drift and eventually diverges; (b) Non-Anchored Type (Original 3DGS): In the original 3DGS and its variants with ADC, the splats tend to drift from the true geometry without dense multi-view photometric information; (c) Anchored Type (Ours): Anchoring effectively prevents splats from becoming detached from the actual geometry.

Main Results: We present a novel rendering approach, Mode-GS, to address these challenges in ground-robot datasets. Our method integrates monocular depth networks with anchored Gaussian splat generation, incorporating a scale-consistent depth calibration mechanism. By utilizing monocular depths, we initialize pixel-aligned anchors that fully cover the frustums across all input images, effectively preventing drift of splats caused by degenerate densification in the absence of sufficient multi-view photometric cues (Fig. 2). Inspired by [24], we design a Residual-Form Gaussian Decoder to robustly generate Gaussian splats around these anchors. Unlike the decoder structure in [24], our novel decoder enables direct initialization of splat attributes (e.g. color) and greatly improves the training efficiency thanks to the efficient residual structure. Lastly, the inherent scale ambiguity of monocular depth is mitigated during training by our Anchor Depth-Scale Parameterization and Scale-Consistent Depth Loss, leading to consistent depth calibration. Our novel contributions include:

• Scale-Consistent Integration of Monocular Depths: We introduce a novel approach that integrates scale-ambiguous monocular depth networks to initialize pixel-aligned splats. Through our anchor depth-scale parameterization and scale-consistent depth loss, we achieve consistent depth calibration during training, eliminating the need for initial SfM or LiDAR point clouds.

• Anchor-Decoder Structure with Residual-Form MLP Decoder: We present an advanced anchor-decoder structure for 3DGS, featuring our proposed residual-form Gaussian decoder. This allows for the direct initialization of anchored Gaussian splat attributes, improving both the efficiency and accuracy of the scene training process.

• Novel-View Synthesis from Ground-Robot Datasets: Our method shows robust rendering performance on ground-robot datasets with free trajectory patterns, achieving state-of-the-art rendering performance on the R3LIVE odometry dataset [16] even without LiDAR point clouds, while maintaining comparable performance on the Tanks and Temples dataset. Note that our algorithm can be built on easily obtainable odometry poses in a point-cloud-free setting, providing a practical and robust rendering pipeline for ground-view robot datasets. We highlight the improvements over prior methods in terms of rendering metrics (PSNR, SSIM, and LPIPS) on these datasets in Section V.

II. RELATED WORKS

Neural Rendering for Robot Navigation: Neural rendering [4], [5] can achieve remarkable photo-realistic rendering quality. This photorealistic rendering can be applied to various robotics applications such as navigation [1], [2], [25], robot data simulation [3], [26], and robotic teleoperation [27]. Prior works present robot navigation methods designed for 3D environments represented as NeRF [1], [25] or 3D Gaussian splats [2]. UAV-Sim [3] and PEGASUS [26] utilize neural rendering to synthesize data for aerial perception or robotic manipulation.

3D Gaussian Splatting with Geometric Priors: 3D Gaussian Splatting (3DGS) [5] has received considerable attention. 3DGS utilizes 3D Gaussian splats [6] as 3D primitives which can be rendered through sorting and rasterization. Scaffold-GS [24] introduces anchors to cluster adjacent 3D Gaussians and uses a Multi-Layer Perceptron (MLP) to predict their attributes. However, since 3D Gaussian splats rely solely on photometric constraints, they often violate geometric coherence. Several studies [28], [29] focusing on surface extraction from Gaussian splats have addressed this issue by aligning flat 3D Gaussian splats with the geometry. 2DGS [30] directly projects 3D Gaussians onto flat 2D Gaussians for effective mesh extraction. Gaussian Opacity Fields [31] extract surfaces using ray-tracing-based volume rendering. To consider geometry regularization, prior methods have used either monocular depth estimation [32], [33] or multi-view stereo [34], [35] for designing training strategies.

3D Gaussian Splatting with SLAM: All of the previous methods depend on precise poses obtained from a Structure from Motion system [17], which requires significant computational resources. Recent research has expanded 3DGS by incorporating various SLAM systems, including those using monocular cameras [21] or RGB-D sensors [22]. The aforementioned methods are limited to small-scale scenes that allow for the collection of dense viewpoints covering large areas of the environment. Additionally, CF-3DGS [8] and InstantSplat [36] present end-to-end frameworks for joint novel view synthesis and camera pose estimation from sequential images. However, none of these methods effectively estimate a reliable camera trajectory in ground-view robot datasets for training the Gaussian splatting technique.

III. PRELIMINARY

3DGS [5] is a differentiable rendering method, which can be trained from n images {I_1, I_2, ..., I_n} to learn a volumetric scene representation G as a mixture of anisotropic Gaussian primitives {G_1, G_2, ..., G_m}. From the trained scene representation G, a tile-based differentiable rasterizer R can render an image I for a novel view T ∈ SE(3) as I = R(G | T).

Gaussian primitives G_i are initialized from the input point cloud with means µ_i at the corresponding point positions. As such, each Gaussian G(x) is represented as

    G(x) = exp( -(1/2) (x - µ)^T Σ^{-1} (x - µ) )    (1)

where x is an arbitrary position in 3D space and Σ denotes the covariance of the Gaussian kernel. Here, Σ is decomposed into a scaling matrix S and a rotation matrix R to preserve the semi-positive-definite form, Σ = R S S^T R^T. Additionally, each Gaussian stores an opacity value o_i, which is multiplied with G(x) as α_i(x) = G_i(x) o_i for the alpha-blending weight α_i, and a view-dependent color c_i represented by spherical harmonics (SH). 3D Gaussians are projected into 2D space [6], and the color C is computed through volumetric rendering, following a front-to-back depth order:

    C = Σ_{i∈N} c_i α_i Π_{j=1}^{i-1} (1 - α_j)    (2)

where N is the set of sorted Gaussians overlapping with the given pixel.

IV. METHODS

Our method aims to enable the rendering of novel-view images from ground-view robot trajectory datasets, which is challenging for the original 3DGS algorithms due to the complexity of ground-view scenes and the lack of multi-view observations under free trajectory patterns (Fig. 2). Our ground-view rendering pipeline integrates monocular depth networks [37] into an anchored Gaussian splatting [24] scheme with the proposed scale-consistent depth calibration framework. This approach leverages pixel-aligned splat initialization from monocular depth networks, thereby further improving the robustness of the anchored Gaussian structure [24].

As can be seen in Fig. 3, our pipeline consists of three main steps: 1) Per-View Anchor Initialization, 2) Anchored Gaussian Splat Generation, and 3) Training from Rendering Losses. Instead of directly initializing Gaussian splats, we initialize anchor points with embedded features and nominal Gaussian attributes, and generate child Gaussian splats by combining the nominal attributes with the residuals decoded from the Gaussian decoders. Note that our pipeline does not require initial SfM or LiDAR point clouds, but is built on easy-to-obtain odometry poses and corresponding images, which increases the practicality of our approach for real-world datasets.

Fig. 3. Our method consists of three main steps: (a) Per-View Anchor Initialization: Given monocular depth images, depth-scale-adjustable anchors are initialized from each view. Each anchor is fixed in the 3D scene except for the depth-scale toward the corresponding view. (b) Anchor Decoding with Residual-Form Gaussian Decoder: Each anchor is decoded into k Gaussian splats by our residual-form Gaussian decoders. When initialized, each anchor contains nominal Gaussian splat attributes (µ̄_j, r̄_j, c̄_j, ō_j, s̄_j) and an embedded feature f_j. The residual decoders generate k sets of residual attributes for child splats, which are combined with the nominal anchor attributes to generate child Gaussian splats. (c) Training with Scale-Consistent Depth Loss and Online Depth-Scale Calibration: We use a scale-consistent depth loss L_depth that incorporates a scale for each monocular depth supervision.

A. Per-View Anchor Initialization

Monocular Anchor Initialization: Instead of utilizing input SfM or LiDAR point clouds, we utilize monocular depth networks to initialize anchor points P fused from each training view i. First, monocular depth images [37] {D_i}_{1:n} are generated from the training images {I_i}_{1:n}. With given odometry poses {(R_i, t_i)}_{1:n}, where R_i ∈ SO(3) and t_i ∈ R^3, the depth images D_{1:n} are unprojected into 3D space. The 3D points are then voxelized with a small resolution ϵ to remove redundant anchor points, generating a cluster of per-view anchor points P = {P_i}_{i∈1:n}.

Anchor Depth-Scale Parameterization: Each set of per-view anchor points P_i has inherent scale ambiguity which needs to be calibrated for multi-view consistency. To allow depth-scale adjustment of each set of per-view anchor points P_i, we introduce a scale parameter ŝ_i for each view i. For each point p ∈ P_i, the scale parameter is applied as p_W = ŝ_i R_i p_C + t_i, transforming the point from camera coordinates C to world coordinates W. This scale parameterization of each anchor point group allows online depth-scale adjustment during training, mitigating the inherent scale-ambiguity problem of monocular depth images.

In complex ground scenes, SfM or LiDAR point cloud data often contains missing 3D structures due to frequent occlusion and lack of observation. This leads to significant drift in 3DGS algorithms when sufficient multi-view observations are unavailable, caused by degenerate cloning through ADC in 3DGS (Fig. 2). Our monocular initialization produces pixel-aligned, fully covering initial point clouds, while addressing per-view depth-scale ambiguity through the proposed parameterization and the scale-consistent depth loss that is further explained in Sec. IV-C.

B. Anchored Gaussian Splat Generation

Residual-Form Gaussian Decoder: We define our Residual-Form Gaussian Decoders to generate k Gaussian splats from each anchor. Each anchor point p_j ∈ P is associated with a feature descriptor f_j ∈ R^32 and nominal Gaussian splat attributes, including position µ̄_j ∈ R^3, color c̄_j ∈ R^3, opacity ō_j ∈ R, and scaling s̄_j ∈ R^3 for covariance composition. For clarity, we slightly modify our notation such that µ̄_j represents the position of the anchor point, while p_j denotes the point itself. We set the nominal values of the covariance-related rotation r̄_j to identity quaternions. Our residual decoders F_µ, F_o, F_c, F_s, F_r are defined for each Gaussian attribute α ∈ {µ, o, c, s, r} as lightweight 2-layer Multi-Layer
Perceptron (MLP) structures. During training and rendering, the decoders generate on-the-fly the residuals ∆α_i from the nominal attribute ᾱ_i stored in each anchor p_j, as follows:

    {∆α_0, ∆α_1, ..., ∆α_{k-1}} = F_α(f_j)    (3)

The use of residual-form decoders along with nominal attributes (µ̄_j, c̄_j, ō_j, s̄_j) enables faster training of the decoders and direct initialization of splat attributes, offering significant advantages over other anchored Gaussian methods [38], [24].

Anchored Gaussian Generation: k child Gaussian splats are spawned from each anchor p_j by combining the decoded residual attributes {∆α}_{1:k} and the nominal attributes ᾱ_j. This is expressed as:

    µ_{1:k} = µ̄_j + ∆µ_{1:k}    (4)
    c_{1:k} = c̄_j + ∆c_{1:k}    (5)
    s_{1:k} = s̄_j · ∆s_{1:k}    (6)
    o_{1:k} = ō_j + ∆o_{1:k}    (7)
    r_{1:k} = r̄_j · ∆r_{1:k}    (8)

As shown in Fig. 5(a), our residual attribute structure enables direct initialization of splat attributes α_{1:k} (e.g. color) by incorporating the reference value ᾱ. This approach accelerates the training of the decoders, as they only need to learn deviations from the reference value. By addressing one of the main weaknesses of the anchor-decoder scaffold structure [24], [38], our method improves both the training efficiency and the robustness of the original framework.

C. Training from Rendering Losses

The generated Gaussian splats provide an explicit 3D scene representation that can be rendered into novel-view color images and depth images, Î, D̂, using the tile-based rasterizer. We use the color image I and the monocular depth image D as supervision during training.

Scale-Consistent Depth Loss: Monocular depth images D inherently contain scale ambiguity, and therefore need to be calibrated with adequate scale parameters when they are used for depth supervision. Unlike previous approaches that employ depth losses based on the scale-invariant Pearson correlation [33], we define our depth loss term with a depth-scale parameter λ̂_i embedded for each monocular depth D_i, as follows:

    L_depth = Σ_{i=1}^{n} log ‖1 + (λ̂_i D_i - D̂_i)‖_2    (9)

Note that this depth-scale parameter λ̂_i differs from the scale parameter ŝ_i introduced in our anchor depth-scale parameterization. Specifically, ŝ_i allows each initialized per-view anchor group to adjust its scale toward the reference view, while λ̂_i corrects the monocular depth scale ambiguity during the loss calibration.

Full Loss Design: In addition to our proposed scaled-depth loss, the full loss function consists of a photometric loss L_photo, a volumetric regularization loss L_vol [39], and an anisotropic regularization loss L_aniso [40]. For completeness, we list these loss functions below:

    L_photo = w · D-SSIM(I_i, Î_i) + (1 - w) · ‖I_i - Î_i‖_1    (10)

In this equation, L_photo represents a combination of the L1 loss and the D-SSIM loss [5], where SSIM stands for the Structural Similarity Index Measure. The weight parameter w controls the balance between the two loss components.

    L_vol = Σ_{p∈P} Prod(s_p)    (11)

    L_aniso = (1/|P|) Σ_{p∈P} ( max{ max(s_p)/min(s_p), r } - r )    (12)
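To make the loss terms concrete, the NumPy sketch below transcribes Eqs. (9), (11), and (12). It is a minimal illustration under stated assumptions, not the authors' implementation: `scales` is assumed to be an (m, 3) array of per-splat scale vectors s_p, the depth maps are dense arrays, and Eq. (9) is read literally, with the 2-norm taken over 1 + (λ̂_i D_i − D̂_i).

```python
import numpy as np

def depth_loss(D_mono, D_render, lam):
    """Eq. (9), read literally: sum_i log ||1 + (lam_i * D_i - Dhat_i)||_2.

    D_mono, D_render: lists of (H, W) depth maps; lam: per-view depth scales.
    """
    total = 0.0
    for D_i, Dhat_i, lam_i in zip(D_mono, D_render, lam):
        total += np.log(np.linalg.norm(1.0 + (lam_i * D_i - Dhat_i)))
    return total

def vol_loss(scales):
    """Eq. (11): sum over splats of the product of each splat's scales."""
    return float(np.prod(scales, axis=1).sum())

def aniso_loss(scales, r=3.0):
    """Eq. (12): mean excess of the max/min scale ratio over a threshold r."""
    ratio = scales.max(axis=1) / scales.min(axis=1)
    return float(np.mean(np.maximum(ratio, r) - r))
```

With these terms, the overall objective of Eq. (13) is just a weighted sum, e.g. `lp * photo + ls * vol_loss(s) + ld * depth_loss(Dm, Dr, lam) + lu * aniso_loss(s)`; the weights and the ratio threshold `r` are hyperparameters whose values the excerpt does not specify.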
Both L_vol and L_aniso are applied to the splats P to regularize their shape. Here, Prod means the product of the covariance-related Gaussian scales [40]. The overall loss function is formulated as:

    L = λ_p L_photo + λ_s L_vol + λ_d L_depth + λ_u L_aniso    (13)

Here, λ_p, λ_s, λ_d, and λ_u represent the weighting factors assigned to each corresponding loss term.

V. EXPERIMENT

We evaluate our method on four challenging ground-view scenes: one indoor and one outdoor scene each from the R3LIVE odometry dataset [16] and the Tanks and Temples dataset [14]. To demonstrate the performance of our method, we report rendering metrics including PSNR, SSIM [41], and LPIPS [42], which are widely used in neural rendering benchmarks [7], [5]. Our algorithm utilized no input point clouds in any of the scenes, while the compared methods used the LiDAR point cloud for R3LIVE and the SfM point cloud for the Tanks and Temples dataset.

For effective comparison and analysis, we categorized existing 3DGS variants into four types according to the characteristics of the regularization or information that they utilize: (1) Geometric methods include 2D Gaussian Splatting (2DGS) [30] and Gaussian Opacity Fields (GOF) [31]. To align the splats with the actual geometry, 2DGS constrains splats to be flat and GOF applies a depth distortion loss. (2) Sequential methods, including COLMAP-Free 3D Gaussian Splatting (CF-3DGS) [8] and MonoGS [21], process each frame sequentially to fully exploit local information and refine image poses. (3) Anchored methods, such as Scaffold-GS [24] and our approach, generate splats anchored to points with restricted movement in 3D space. Finally, (4) Baselines include the original 3DGS [5] and Mip-Splatting [43].

A. Rendering evaluation on the R3LIVE dataset

The R3LIVE dataset is a publicly available odometry dataset captured by a hand-held device with a 15 Hz camera, a 200 Hz Inertial Measurement Unit (IMU), and a 10 Hz Livox Avia LiDAR sensor. It includes diverse indoor and outdoor scenes from the campuses of HKU and HKUST. Unlike typical neural rendering datasets with object-centric views or structured viewing patterns [14], [11], [13], the R3LIVE dataset captures many complex indoor and outdoor structures with free trajectory patterns.

We process the IMU, LiDAR, and image data using the R3LIVE [16] multi-sensor odometry pipeline to generate pose-tagged image sequences. To synchronize pose estimation with image time stamps, we slightly modified the R3LIVE odometry implementation. However, it still inevitably introduces pixel-level errors due to sensor fusion and inaccurate extrinsic calibration. To avoid redundancy from high frame-rate images, we subsample the images by selecting every 10th frame from the dataset.

Fig. 4. Qualitative comparison on two scenes from the R3LIVE dataset, Main Building (hku_main_building) and Campus (hku_campus_seq_00), rendered by 3DGS, GOF, Scaffold-GS, and our method against ground truth (GT). Non-anchored methods, such as 3DGS [5] and GOF [31], exhibit significant splat drift in the absence of dense multi-view information in sparsely captured scenes. In contrast, both Scaffold-GS [24] and our method demonstrate robust performance due to their use of anchored splatting. Our approach delivers sharper and more accurate results, attributed to fast training from the direct initialization of splat attributes and dense, pixel-aligned anchor initialization from monocular depth estimation.
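For reference, PSNR, the primary metric reported in the tables, can be computed as below. This is the standard definition shown as a minimal sketch, not the authors' exact evaluation code; images are assumed to be float arrays in [0, 1], and `subsample` is a hypothetical helper mirroring the every-10th-frame selection described above.

```python
import numpy as np

def psnr(img, ref, max_val=1.0):
    # Peak Signal-to-Noise Ratio in dB between a rendered image and its
    # ground-truth counterpart; higher is better.
    mse = np.mean((np.asarray(img, dtype=np.float64) -
                   np.asarray(ref, dtype=np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

def subsample(frames, step=10):
    # Keep every `step`-th frame to reduce redundancy in high-rate sequences.
    return frames[::step]
```

For example, a uniform error of 0.5 over the whole image gives an MSE of 0.25 and hence a PSNR of 10·log10(1/0.25) ≈ 6.02 dB.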
TABLE I
RESULTS ON THE R3LIVE DATASET

                                        Main Building            Campus
Method              Type        Input   PSNR↑  SSIM↑  LPIPS↓     PSNR↑  SSIM↑  LPIPS↓
3DGS [5]            Baseline    LiDAR   15.84  0.684  0.497      21.60  0.760  0.364
Mip-Splatting [43]  Baseline    LiDAR   13.58  0.630  0.499      15.89  0.679  0.449
CF-3DGS [8]         Sequential  Mono    –      –      –          –      –      –
MonoGS [21]         Sequential  Mono    11.60  0.486  0.557      16.06  0.600  0.527
GOF [31]            Geometric   LiDAR   15.53  0.676  0.502      19.78  0.734  0.478
2DGS [30]           Geometric   LiDAR   15.72  0.674  0.507      20.67  0.743  0.418
Scaffold-GS [24]    Anchored    LiDAR   17.01  0.697  0.495      20.94  0.756  0.419
Ours w/ mono        Anchored    Mono    17.27  0.703  0.470      22.98  0.774  0.365

TABLE II
RESULTS ON THE TANKS AND TEMPLES DATASET

                                        Courthouse               Meeting Room
Method              Type        Input   PSNR↑  SSIM↑  LPIPS↓     PSNR↑  SSIM↑  LPIPS↓
3DGS [5]            Baseline    SfM     15.92  0.689  0.396      16.86  0.639  0.435
Mip-Splatting [43]  Baseline    SfM     15.44  0.683  0.366      16.39  0.646  0.417
CF-3DGS [8]         Sequential  Mono    –      –      –          15.53  0.614  0.510
MonoGS [21]         Sequential  Mono    9.78   0.453  0.637      12.01  0.488  0.591
GOF [31]            Geometric   SfM     15.49  0.680  0.377      16.56  0.636  0.459
2DGS [30]           Geometric   SfM     16.57  0.705  0.382      17.20  0.636  0.455
Scaffold-GS [24]    Anchored    SfM     17.12  0.719  0.345      17.42  0.677  0.413
Ours w/ mono        Anchored    Mono    15.70  0.682  0.442      16.66  0.641  0.462

TABLE III
ABLATION OF EACH PROPOSED MODULE

depth cal.   res. MLP   PSNR ↑   SSIM ↑   LPIPS ↓
                        16.84    0.687    0.475
✓                       16.91    0.693    0.467
             ✓          17.03    0.688    0.465
✓            ✓          17.27    0.703    0.470

Fig. 5. (a) With the Residual-Form Gaussian Decoder (top), only the residual from the nominal color is estimated and trained by the decoder, allowing direct color initialization and fast training. The Direct-Form Gaussian Decoder (bottom) [24], [38] does not allow color initialization due to its on-the-fly decoding scheme. (b) Rendering performance (PSNR) ablation between the Direct-Form Color MLP and the Residual-Form Color MLP.
As shown in Table I, our method outperforms all state-of-the-art 3DGS variants in terms of rendering performance. Notably, all algorithms except the Anchored variants exhibit significantly lower performance in the Main Building scene, which presents considerable challenges due to the complexity of its narrow corridors and hallways. In both scenes, our algorithm achieves state-of-the-art rendering performance even without using initial LiDAR point clouds. Monocular-depth-based scene initialization delivers much better or comparable performance to LiDAR-based initialization in both scenes. This is largely due to the inherent incompleteness of LiDAR point clouds in such complex environments. Our results demonstrate that effectively integrating monocular depths, which directly generate pixel-aligned and complete point clouds, can be more beneficial in these challenging scenes.

B. Rendering evaluation on the Tanks and Temples dataset

We also validate our method on the Tanks and Temples dataset, a widely recognized benchmark for neural rendering evaluation. We select the Courthouse and Meeting Room scenes, as they are geometrically the most challenging in the benchmark [31]. Similar to the evaluation on the R3LIVE dataset, we subsample every 10th image. Instead of LiDAR point clouds, we directly use the SfM point cloud for the other variants.

Unlike the R3LIVE odometry dataset, the trajectories in this dataset follow object-centric or circular patterns, providing relatively dense observations of each part of the scene. For this reason, our method does not achieve the best performance. In these viewing patterns, the initial point cloud has less impact on 3DGS algorithms, as splat cloning and splitting are effectively guided by salient multi-view photometric gradients. It has even been shown that random initialization can yield plausible results in such cases [44]. As monocular depth usually contains inevitable internal distortion that cannot be corrected by a scale factor, anchoring on these depths can be detrimental when enough photometric information is available. Nonetheless, the best-performing algorithm in this scenario is the anchor-based Scaffold-GS [24], validating our analysis of the degeneracy patterns associated with different algorithm types (Fig. 2).

As shown in Table II, our method still shows comparable performance to the other 3DGS methods. In this sense, our method achieves a suitable balance between robustness to limited multi-view information and high performance on densely captured datasets, demonstrating the most stable performance across all the scenes from the R3LIVE and Tanks and Temples datasets.

C. Ablation Studies

We evaluated our depth calibration framework and our residual-form Gaussian MLP in Table III. As shown in the ablation results, each proposed module contributes to the overall improvement in rendering performance. Additionally, our residual-form Gaussian decoder enables fast initialization of Gaussian attributes, as illustrated in Fig. 5(a). For this ablation, we used a direct-form MLP that generates color attributes from features, similar to [24], [38]. Compared to this direct-form MLP decoder, our proposed decoder significantly accelerates the training process (Fig. 5(b)).

VI. CONCLUSIONS, LIMITATIONS, AND FUTURE WORK

In this paper, we presented Mode-GS, a novel 3DGS algorithm designed for robust neural rendering from ground-robot trajectory datasets. Our algorithm introduces a practical rendering pipeline for ground-view robot datasets, utilizing easily obtainable odometry poses and operating in a point-cloud-free setting. However, our approach is less effective in scenarios where extensive multi-view data is available, such as in densely captured, object-centric datasets. Future work will focus on developing a hybrid approach that integrates our method with non-anchored splats to achieve optimal performance.
REFERENCES

[1] M. Adamkiewicz, T. Chen, A. Caccavale, R. Gardner, P. Culbertson, J. Bohg, and M. Schwager, "Vision-Only Robot Navigation in a Neural Radiance World," IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 4606–4613, 2022.
[2] T. Chen, O. Shorinwa, J. Bruno, J. Yu, W. Zeng, K. Nagami, P. Dames, and M. Schwager, "Splat-Nav: Safe Real-Time Robot Navigation in Gaussian Splatting Maps," arXiv preprint arXiv:2403.02751, 2024.
[3] C. Maxey, J. Choi, H. Lee, D. Manocha, and H. Kwon, "UAV-Sim: NeRF-based Synthetic Data Generation for UAV-based Perception," arXiv preprint arXiv:2310.16255, 2023.
[4] B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, "NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis," Communications of the ACM, vol. 65, no. 1, pp. 99–106, 2021.
[5] B. Kerbl, G. Kopanas, T. Leimkühler, and G. Drettakis, "3D Gaussian Splatting for Real-Time Radiance Field Rendering," ACM Transactions on Graphics, vol. 42, no. 4, pp. 1–14, 2023.
[6] M. Zwicker, H. Pfister, J. Van Baar, and M. Gross, "EWA splatting," IEEE Transactions on Visualization and Computer Graphics, vol. 8, no. 3, pp. 223–238, 2002.
[7] P. Wang, Y. Liu, Z. Chen, L. Liu, Z. Liu, T. Komura, C. Theobalt, and W. Wang, "F2-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
[8] Y. Fu, S. Liu, A. Kulkarni, J. Kautz, A. A. Efros, and X. Wang, "COLMAP-Free 3D Gaussian Splatting," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[9] C. Liu, S. Chen, Y. Bhalgat, S. Hu, Z. Wang, M. Cheng, V. A. Prisacariu, and T. Braud, "GSLoc: Efficient Camera Pose Refinement via 3D Gaussian Splatting," arXiv preprint arXiv:2408.11085, 2024.
[10] L. Zhao, P. Wang, and P. Liu, "BAD-Gaussians: Bundle Adjusted Deblur Gaussian Splatting," arXiv preprint arXiv:2403.11831, 2024.
[11] H. Turki, D. Ramanan, and M. Satyanarayanan, "Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[12] Y. Xiangli, L. Xu, X. Pan, N. Zhao, A. Rao, C. Theobalt, B. Dai, and D. Lin, "BungeeNeRF: Progressive Neural Radiance Field for Extreme Multi-scale Scene Rendering," in Proceedings of the European Conference on Computer Vision (ECCV), 2022.
[13] J. T. Barron, B. Mildenhall, D. Verbin, P. P. Srinivasan, and P. Hedman, "Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[14] A. Knapitsch, J. Park, Q.-Y. Zhou, and V. Koltun, "Tanks and Temples: Benchmarking Large-Scale Scene Reconstruction," ACM Transactions on Graphics, vol. 36, no. 4, 2017.
[15] Y. Liao, J. Xie, and A. Geiger, "KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022.
[16] J. Lin and F. Zhang, "R3LIVE: A Robust, Real-time, RGB-colored, LiDAR-Inertial-Visual tightly-coupled state Estimation and mapping package," in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2022.
[17] J. L. Schonberger and J.-M. Frahm, "Structure-From-Motion Revisited," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[18] C. Campos, R. Elvira, J. J. Gómez, J. M. M. Montiel, and J. D. Tardós, "ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM," IEEE Transactions on Robotics, vol. 37, no. 6, pp. 1874–1890, 2021.
[19] T. Shan and B. Englot, "LeGO-LOAM: Lightweight and Ground-Optimized Lidar Odometry and Mapping on Variable Terrain," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018.
[20] T. Qin, P. Li, and S. Shen, "VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator," IEEE Transactions on Robotics, vol. 34, no. 4, pp. 1004–1020, 2018.
[21] H. Matsuki, R. Murai, P. H. Kelly, and A. J. Davison, "Gaussian Splatting SLAM," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[22] N. Keetha, J. Karhade, K. M. Jatavallabhula, G. Yang, S. Scherer, D. Ramanan, and J. Luiten, "SplaTAM: Splat Track & Map 3D Gaussians for Dense RGB-D SLAM," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[23] C. Yan, D. Qu, D. Xu, B. Zhao, Z. Wang, D. Wang, and X. Li, "GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[24] T. Lu, M. Yu, L. Xu, Y. Xiangli, L. Wang, D. Lin, and B. Dai, "Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[25] Q. Liu, H. Xin, Z. Liu, and H. Wang, "Integrating Neural Radiance Fields End-to-End for Cognitive Visuomotor Navigation," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024.
[26] L. Meyer, F. Erich, Y. Yoshiyasu, M. Stamminger, N. Ando, and Y. Domae, "PEGASUS: Physically Enhanced Gaussian Splatting Simulation System for 6DOF Object Pose Dataset Generation," arXiv preprint arXiv:2401.02281, 2024.
[27] V. Patil and M. Hutter, "Radiance Fields for Robotic Teleoperation," arXiv preprint arXiv:2407.20194, 2024.
[28] A. Guédon and V. Lepetit, "SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 5354–5363.
[29] H. Chen, C. Li, and G. H. Lee, "NeuSG: Neural Implicit Surface Reconstruction with 3D Gaussian Splatting Guidance," arXiv preprint arXiv:2312.00846, 2023.
[30] B. Huang, Z. Yu, A. Chen, A. Geiger, and S. Gao, "2D Gaussian Splatting for Geometrically Accurate Radiance Fields," in ACM SIGGRAPH 2024 Conference Papers, 2024, pp. 1–11.
[31] Z. Yu, T. Sattler, and A. Geiger, "Gaussian Opacity Fields: Efficient Adaptive Surface Reconstruction in Unbounded Scenes," arXiv preprint arXiv:2404.10772, 2024.
[32] B. Zhang, C. Fang, R. Shrestha, Y. Liang, X. Long, and P. Tan, "RaDe-GS: Rasterizing Depth in Gaussian Splatting," arXiv preprint arXiv:2406.01467, 2024.
[33] M. Turkulainen, X. Ren, I. Melekhov, O. Seiskari, E. Rahtu, and J. Kannala, "DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing," arXiv preprint arXiv:2403.17822, 2024.
[34] K. Cheng, X. Long, K. Yang, Y. Yao, W. Yin, Y. Ma, W. Wang, and X. Chen, "GaussianPro: 3D Gaussian Splatting with Progressive Propagation," in Forty-first International Conference on Machine Learning, 2024.
[35] Z. Li, S. Yao, Y. Chu, A. F. Garcia-Fernandez, Y. Yue, E. G. Lim, and X. Zhu, "MVG-Splatting: Multi-View Guided Gaussian Splatting with Adaptive Quantile-Based Geometric Consistency Densification," arXiv preprint arXiv:2407.11840, 2024.
[36] Z. Fan, W. Cong, K. Wen, K. Wang, J. Zhang, X. Ding, D. Xu, B. Ivanovic, M. Pavone, G. Pavlakos, Z. Wang, and Y. Wang, "InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds," arXiv preprint arXiv:2403.20309, 2024.
[37] W. Yin, C. Zhang, H. Chen, Z. Cai, G. Yu, K. Wang, X. Chen, and C. Shen, "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
[38] E. Ververas, R. A. Potamias, J. Song, J. Deng, and S. Zafeiriou, "SAGS: Structure-Aware 3D Gaussian Splatting," arXiv preprint arXiv:2404.19149, 2024.
[39] S. Lombardi, T. Simon, G. Schwartz, M. Zollhoefer, Y. Sheikh, and J. Saragih, "Mixture of Volumetric Primitives for Efficient Neural Rendering," ACM Transactions on Graphics, vol. 40, no. 4, 2021. [Online]. Available: [Link]
[40] T. Xie, Z. Zong, Y. Qiu, X. Li, Y. Feng, Y. Yang, and C. Jiang, "PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics," arXiv preprint arXiv:2311.12198, 2023.
[41] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
[42] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, "The Unreasonable Effectiveness of Deep Features as a Perceptual Metric," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[43] Z. Yu, A. Chen, B. Huang, T. Sattler, and A. Geiger, "Mip-Splatting: Alias-free 3D Gaussian Splatting," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[44] J. Jung, J. Han, H. An, J. Kang, S. Park, and S. Kim, "Relaxing Accurate Initialization Constraint for 3D Gaussian Splatting," arXiv preprint arXiv:2403.09413, 2024.