environment basics

List of HADOOP port numbers

List of port numbers (default web UI port and the configuration property that sets it):
NAMENODE PORT: 50070 => dfs.http.address
RESOURCE MANAGER PORT: 8088 => yarn.resourcemanager.webapp.address
JOB HISTORY PORT: 19888 => mapreduce.jobhistory.webapp.address
DATANODE PORT: 50075 => dfs.datanode.http.address
SECONDARY NAMENODE PORT: 50090 => dfs.secondary.http.address
CHECKPOINT (BACKUP NODE) PORT: 50105 => dfs.backup.http.address
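As a quick illustration of how the list above might be used, here is a minimal Python sketch that maps each daemon to its default web port and builds the matching URL. The dictionary keys, the function name web_ui_url, and the host name in the usage line are hypothetical; the ports are simply the defaults listed above and may differ if your cluster overrides the corresponding properties.

```python
# Default web UI ports, copied from the list above.
# Assumption: the cluster has not overridden these in its configuration.
DEFAULT_WEB_PORTS = {
    "namenode": 50070,
    "resourcemanager": 8088,
    "jobhistory": 19888,
    "datanode": 50075,
    "secondarynamenode": 50090,
    "backupnode": 50105,
}


def web_ui_url(daemon, host):
    """Return the web UI URL for `daemon` running on `host` (default port)."""
    port = DEFAULT_WEB_PORTS[daemon]  # raises KeyError for an unknown daemon
    return f"http://{host}:{port}/"
```

For example, web_ui_url("namenode", "10.0.0.1") gives the address you would open in a browser to reach the namenode status page.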

All Hadoop Daemons:
/logs: Exposes, for downloading, the log files in the directory named by the Java system property hadoop.log.dir.
/logLevel: Allows you to dial log4j logging levels up or down. This is similar to hadoop daemonlog on the command line.
/stacks: Stack traces for all threads. Useful for debugging.
/metrics: Metrics for the server. Use /metrics?format=json to retrieve the data in a structured form. Available in 0.21.
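The structured metrics output mentioned above can be polled from a script. The following is a rough Python sketch using only the standard library; the function names metrics_url and fetch_metrics are made up for illustration, and the shape of the returned JSON is not specified here, so the code just decodes it and hands it back.

```python
import json
from urllib.request import urlopen


def metrics_url(host, port):
    """Build the URL of a daemon's metrics endpoint in JSON form."""
    return f"http://{host}:{port}/metrics?format=json"


def fetch_metrics(host, port, timeout=5.0):
    """Fetch and decode the metrics JSON from a running daemon.

    Assumption: the daemon is reachable and serves /metrics (0.21 or later).
    Raises urllib.error.URLError if the daemon cannot be contacted.
    """
    with urlopen(metrics_url(host, port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling fetch_metrics("10.0.0.1", 50070) against a live namenode would return whatever structure that Hadoop version emits; inspect it before relying on particular keys.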


<namenode IP>:50070

/dfshealth.jsp: Shows information about the namenode as well as HDFS. There’s a link from here to browse the filesystem as well.
/dfsnodelist.jsp?whatNodes=(DEAD|LIVE): Shows lists of nodes that are disconnected from (DEAD) or connected to (LIVE) the namenode.
/fsck: Runs the “fsck” command. Not recommended on a busy cluster.
/listPaths: Returns an XML-formatted directory listing. This is useful if you wish (for example) to poll HDFS to see if a file exists. The URL can include a path (e.g., /listPaths/user/philip) and can take optional GET arguments: /listPaths?recursive=yes will return all files on the file system; /listPaths/user/philip?filter=s.* will return all files in the home directory that start with s; and /listPaths/user/philip?exclude=.txt will return all files except text files in the home directory. Beware that filter and exclude operate on the directory listed in the URL, and they ignore the recursive flag.
/data and /fileChecksum: These forward your HTTP request to an appropriate datanode, which in turn returns the data or the checksum.
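The /listPaths query options described above (path, recursive, filter, exclude) are easy to get wrong when typed by hand, so here is a small Python sketch that assembles the URL. The function name list_paths_url and its parameter names are hypothetical; only the endpoint and its three GET arguments come from the text. Note that urlencode percent-encodes characters such as * in the regular expression.

```python
from urllib.parse import quote, urlencode


def list_paths_url(host, path="", recursive=False,
                   filter_regex=None, exclude_regex=None, port=50070):
    """Build a namenode /listPaths URL.

    `path` is an HDFS directory such as "/user/philip"; "" or "/" lists
    from the root. Per the text above, filter and exclude apply to the
    directory in the URL and ignore the recursive flag.
    """
    params = {}
    if recursive:
        params["recursive"] = "yes"
    if filter_regex is not None:
        params["filter"] = filter_regex
    if exclude_regex is not None:
        params["exclude"] = exclude_regex

    suffix = "" if path in ("", "/") else quote(path)
    url = f"http://{host}:{port}/listPaths{suffix}"
    if params:
        url += "?" + urlencode(params)
    return url
```

A polling script could request list_paths_url("10.0.0.1", "/user/philip", exclude_regex=".txt") and parse the XML response to check whether a file has appeared.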

Datanodes:

<datanode IP>:50075

/browseBlock.jsp, /browseDirectory.jsp, /tail.jsp, /streamFile, /getFileChecksum: These are the endpoints that the namenode redirects to when you are browsing filesystem content. You probably wouldn’t use these directly, but this is what’s going on underneath.
/blockScannerReport: Every datanode verifies its blocks at configurable intervals. This endpoint provides a listing of that check.

The secondarynamenode exposes a simple status page with information including which namenode it’s talking to, when the last checkpoint was, how big it was, and which directories it’s using.


Under the Covers for the Developer and the System Administrator

Internally, Hadoop mostly uses Hadoop IPC to communicate among servers. (Part of the goal of the Apache Avro project is to replace Hadoop IPC with something that is easier to evolve and more language-agnostic; HADOOP-6170 is the relevant ticket.) Hadoop also uses HTTP (for the secondarynamenode communicating with the namenode and for the tasktrackers serving map outputs to the reducers) and a raw network socket protocol (for datanodes copying data around).

The following table presents the ports and protocols (including the relevant Java class) that Hadoop uses. This table does not include the HTTP ports mentioned above.

Daemon | Default Port | Configuration Parameter | Protocol | Used for
Namenode | 8020 | fs.default.name† | IPC: ClientProtocol | Filesystem metadata operations
Datanode | 50010 | dfs.datanode.address | Custom Hadoop Xceiver: DataNode and DFSClient | DFS data transfer
Datanode | 50020 | dfs.datanode.ipc.address | IPC: InterDatanodeProtocol, ClientDatanodeProtocol | Block metadata operations and recovery
Backupnode | 50100 | dfs.backup.address | Same as namenode | HDFS metadata operations
Jobtracker | Ill-defined‡ | mapred.job.tracker | IPC: JobSubmissionProtocol, InterTrackerProtocol | Job submission, task tracker heartbeats
Tasktracker | ¤ | mapred.task.tracker.report.address | IPC: TaskUmbilicalProtocol | Communicating with child jobs

† This is the port part of hdfs://host:8020/.
‡ Default is not well-defined. Common values are 8021, 9001, or 8012. See MAPREDUCE-566.
¤ Binds to an unused local port.
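Since the namenode IPC port is just the port part of the fs.default.name URI, a script can recover it by parsing that value. The sketch below is one way to do this in Python; the function name namenode_ipc_endpoint is hypothetical, and falling back to 8020 when the URI omits a port is an assumption based on the default shown in the table.

```python
from urllib.parse import urlparse

# Default namenode IPC port from the table above.
DEFAULT_NAMENODE_IPC_PORT = 8020


def namenode_ipc_endpoint(fs_default_name):
    """Split an fs.default.name value into (host, port).

    Assumption: when the URI carries no explicit port, the table's
    default of 8020 applies.
    """
    parsed = urlparse(fs_default_name)
    if parsed.scheme != "hdfs" or not parsed.hostname:
        raise ValueError(f"not an hdfs:// URI: {fs_default_name!r}")
    return parsed.hostname, parsed.port or DEFAULT_NAMENODE_IPC_PORT
```

The resulting pair is what a firewall rule or a connectivity check for client-to-namenode metadata traffic would need to target.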
