vector/sinks/webhdfs/mod.rs
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46
//! `webhdfs` sink.
//!
//! The Hadoop Distributed File System (HDFS) is a distributed file system
//! designed to run on commodity hardware. HDFS consists of a namenode and a
//! datanode. We will send rpc to namenode to know which datanode to send
//! and receive data to. Also, HDFS will rebalance data across the cluster
//! to make sure each file has enough redundancy.
//!
//! ```txt
//! ┌───────────────┐
//! │ Data Node 2 │
//! └───────────────┘
//! ▲
//! ┌───────────────┐ │ ┌───────────────┐
//! │ Data Node 1 │◄──────────┼───────────►│ Data Node 3 │
//! └───────────────┘ │ └───────────────┘
//! ┌───────┴───────┐
//! │ Name Node │
//! └───────────────┘
//! ▲
//! │
//! ┌──────┴─────┐
//! │ Vector │
//! └────────────┘
//! ```
//!
//! WebHDFS will connect to the HTTP RESTful API of HDFS.
//!
//! For more information, please refer to:
//!
//! - [HDFS Users Guide](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html)
//! - [WebHDFS REST API](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/WebHDFS.html)
//! - [opendal::services::webhdfs](https://docs.rs/opendal/latest/opendal/services/struct.Webhdfs.html)
//!
//! `webhdfs` is an OpenDal based services. This mod itself only provide
//! config to build an [`crate::sinks::opendal_common::OpenDalSink`]. All real implement are powered by
//! [`crate::sinks::opendal_common::OpenDalSink`].
mod config;
pub use self::config::WebHdfsConfig;
#[cfg(test)]
mod test;
#[cfg(all(test, feature = "webhdfs-integration-tests"))]
mod integration_tests;