## 1. Introduction

Transportation networks, including those for rail, air, road and maritime transport, play a critical role in the economy in terms of production, consumption and international trade. The maritime sector moves more than 90% of world trade by volume. Network research is an emerging area applied to various disciplines, including maritime transport. In general, a network comprises two basic components: nodes and edges. Nodes are the basic elements, entities or fundamental units of the network. Edges are the node pairs that represent the relationships or connections between nodes. The connectivity between the nodes of a network depends on the nature of the network. In transport networks, nodes are ports or terminals connected by transport services. In a maritime transport network, nodes (vertices) can be seaports, dry ports, terminals and depots, and links (edges) refer to transport services such as shipping, road, rail and air transport services.

In the last decade, various aspects of maritime networks have been studied, including network flows (Ducruet, 2016), worldwide production networks (Jacobs et al., 2010), liner cooperation or port resilience networks (Justice et al., 2016; Wilmsmeier, 2016) and intermodal transport networks (Lee et al., 2014). Many studies have focused on network configurations and organizations, e.g. the hierarchies and statuses of port networks (Ducruet et al., 2010a; Ducruet et al., 2010b), competition and cooperation (Jacobs, 2007), the containerization, globalization, and regionalization of ports (Notteboom and Rodrigue, 2005; Rodrigue and Notteboom, 2013), network competition and topology analysis (Nguyen, 2014; Tsiotas and Polyzos, 2015), and operations research of networks (Christiansen et al., 2013; Meng et al., 2013).

Recently, social network analysis (SNA) has emerged as a new approach to studying networks and has been applied to various disciplines such as sociology, biology, economics, information and management. The main focus of SNA is on the formation and connectivity of a network as a whole, while allowing for the random nature of connectivity and its effect on network formation. It also covers aspects of a network, such as centrality measures, network density and diameter, and clusters, that have not been considered from the perspectives of operations research and other approaches. The objective of this study is to review the applications of SNA to maritime transport. This paper is divided into four sections. The next section presents the related measures and statistical properties used to analyse a network. Section 3 explains the network models and their applications to maritime transport research. Section 4 discusses the implications for future research and concludes.

## 2. Network related measures and statistics

Network measures are descriptive statistics of a network providing essential information about its properties. This section presents the properties of a network in terms of its key measures and statistics, which are then used in the network models reviewed in this paper. A network, often represented as a graph, is denoted as $G(N, L)$, comprising a set of nodes $N = \{n_1, n_2, \ldots, n_N\}$ and a set of edges $L = \{l_1, l_2, \ldots, l_L\}$ that connect the nodes. The graph $G$ of a binary network, in which a link can be either zero or one (yes or no), can be represented using an adjacency matrix $Y$ with dimension $N \times N$, where $y_{ij} = 1$ if an edge exists from node $i$ to node $j$, and $y_{ij} = 0$ otherwise. Alternatively, a binary network can be expressed using its edge list $E$, a two-column matrix in which edge $l$ is given by the $l^{th}$ row of $E$: $e_{l1}$ denotes the origin node of edge $l$ and $e_{l2}$ is the destination node of the edge. In a weighted network, in which a link can take any positive value as opposed to zero and one, the graph $G$ is described in terms of non-negative integer values for the entries in $Y$.
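The two representations described above can be illustrated with a minimal sketch. The five ports (A to E) and their links are hypothetical and serve only as a toy example; the function name `adjacency_matrix` is likewise an assumption made for illustration.

```python
# Hypothetical toy network: five ports and the services linking them.
ports = ["A", "B", "C", "D", "E"]
edge_list = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]


def adjacency_matrix(nodes, edges, directed=False):
    """Build the N x N binary adjacency matrix Y from an edge list E."""
    index = {node: position for position, node in enumerate(nodes)}
    y = [[0] * len(nodes) for _ in nodes]
    for origin, destination in edges:
        i, j = index[origin], index[destination]
        y[i][j] = 1
        if not directed:
            y[j][i] = 1  # an undirected link appears in both directions
    return y


Y = adjacency_matrix(ports, edge_list)
for port, row in zip(ports, Y):
    print(port, row)
```

For an undirected network the matrix is symmetric, so each row of the edge list contributes two entries of one to $Y$.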

One important measure of a network is its density, which indicates the level of connectivity between its nodes. The density of a network is defined as the ratio of the total number of edges $m(G)$ to the maximum possible number of edges $m_{max}(G)$ (Oliveira and Gama, 2012):

$$\rho = \frac{m(G)}{m_{max}(G)} \tag{1}$$

where $m_{max}(G)$ is $n(n-1)/2$ for an undirected network and $n(n-1)$ for a directed network. A high value of density indicates that the nodes are closely connected (Yang et al., 2018). The density can also be used to identify a complete network: $\rho = 1$ implies that the network is complete, i.e. with all nodes connected to each other, whereas $\rho < 1$ means that the network is sparse or incomplete.
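As a hedged illustration of the density measure, the sketch below computes the ratio of existing to possible edges for the undirected and directed cases. The function name `network_density` and the example figures (five ports, five services) are assumptions made for illustration only.

```python
def network_density(n_nodes, n_edges, directed=False):
    """Share of possible edges that are present: m(G) / m_max(G)."""
    if n_nodes < 2:
        raise ValueError("density needs at least two nodes")
    max_edges = n_nodes * (n_nodes - 1)  # directed case: n(n-1)
    if not directed:
        max_edges //= 2  # undirected case: each pair is counted once
    return n_edges / max_edges


# Hypothetical example: 5 ports linked by 5 undirected services.
rho = network_density(5, 5)
print(rho)  # 0.5
```

The same five links in a directed network give a density of 0.25, because twice as many ordered pairs are possible.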

For a maritime transport network, the density measure can be used to indicate the level of service coverage. Hu and Zong (2013) used the network density concept to describe the tightness of the port-shipping network in China. Ducruet and Notteboom (2012) used the network density measure to examine the characteristics of the global container shipping network, while Tsiotas and Polyzos (2015) analysed the density of the maritime transport system in Greece.

Degree distribution indicates the connectivity of nodes in a network and is one of the most important network statistics (Hu and Zong, 2013). It measures the probability distribution ($p_k$), given by the fraction of nodes which have degree $k$ ($n_k$) out of the total number of nodes ($N$) in the network:

$$p_k = \frac{n_k}{N} \tag{2}$$

The degree distribution reflects the overall or global connectivity of the network system (Liu et al., 2012). For example, in a binomial random graph model (presented below) with $n$ nodes, where $p$ denotes the probability that a given pair of nodes is connected by an edge, the degree distribution is (Jackson, 2011):

$$p_k = \binom{n-1}{k} p^{k} (1-p)^{n-1-k} \tag{3}$$

For a network with a large $n$ and a small $p$, the degree distribution can be approximated by the Poisson distribution:

$$p_k \approx \frac{e^{-(n-1)p}\,\big((n-1)p\big)^{k}}{k!} \tag{4}$$

When degree distribution follows the power law, a network is scale free. Namely, there are a small number of nodes that have high connectivity, and a large number of nodes that have low connectivity (Liu et al., 2018b).
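The following sketch shows, under stated assumptions, how an empirical degree distribution can be tabulated from an edge list and how the binomial degree distribution compares with its Poisson approximation. The toy edge list and all function names are hypothetical; `math.comb` requires Python 3.8 or later.

```python
from collections import Counter
from math import comb, exp, factorial


def degree_distribution(edges):
    """Empirical p_k = n_k / N for an undirected edge list."""
    degree = Counter()
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    n_k = Counter(degree.values())  # number of nodes with each degree k
    total = len(degree)  # N: only nodes that appear in the edge list
    return {k: count / total for k, count in sorted(n_k.items())}


def binomial_degree_pmf(k, n, p):
    """Probability that a node has degree k in a binomial random graph."""
    return comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)


def poisson_degree_pmf(k, n, p):
    """Poisson approximation with mean (n - 1) * p (large n, small p)."""
    mean = (n - 1) * p
    return exp(-mean) * mean**k / factorial(k)


edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
p_k = degree_distribution(edges)
print(p_k)  # {1: 0.2, 2: 0.6, 3: 0.2}
print(binomial_degree_pmf(3, 1000, 0.003), poisson_degree_pmf(3, 1000, 0.003))
```

Note that isolated nodes do not appear in an edge list, so this sketch would need the full node set to count nodes of degree zero.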

Degree distribution has been used in many studies on maritime transport networks to examine the characteristics of shipping and port networks. For example, Tsiotas and Polyzos (2015) used degree distribution to investigate the characteristics of the maritime transport system in Greece. It has been used to analyse the topological properties of the “Maritime Silk Road” shipping network, especially for container shipping (Jiang et al., 2019), and to study the characteristics of the China, Japan and Korea shipping network (Guo et al., 2017). Existing studies have also used degree distribution to explore the characteristics of three types of network, i.e. regular, scale-free and random networks (Bagler, 2008; Jiang et al., 2019; Tsiotas et al., 2018).

Centrality is used to explore a network at the local level, i.e. concerning a node, especially its importance or popularity through its connectivity with other nodes in the network. SNA uses various centrality measures. Degree centrality, the most basic measure of centrality, is defined for a node as the number of edges connected to it (Freeman, 1978):

$$C_D(i) = \sum_{j} a_{ij} \tag{5}$$

where $a_{ij}$ is one if nodes $i$ and $j$ are linked by an edge and is zero otherwise. Given the role of ports as logistics nodes providing access to other ports and the hinterland, degree centrality is a useful indicator of how well a port is connected in the network; a high degree centrality indicates that the port is important (Zhang et al., 2018), well connected in the network or region (Hadas et al., 2017), or highly influential (Lu et al., 2018). It can also be used to indicate the importance of a hub or central port in the network (Jiang et al., 2019; Tran and Haasis, 2014).
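As a brief sketch, degree centrality can be read directly off a binary adjacency matrix as its row sums. The five-port matrix below is a hypothetical example, and `degree_centrality` is an illustrative name rather than a function from any particular library.

```python
def degree_centrality(y):
    """Row sums of a binary adjacency matrix: one value per node."""
    return [sum(row) for row in y]


# Hypothetical five-port network (rows and columns ordered A, B, C, D, E).
Y = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
degrees = degree_centrality(Y)
hub = degrees.index(max(degrees))  # position of the best-connected port
print(degrees, hub)  # [2, 2, 3, 2, 1] 2
```

In this toy network port C (index 2) has the highest degree and would be the candidate hub.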

For a weighted network, the degree centrality of a node can be calculated in terms of the strength centrality, based on the weights of edges rather than on the number of edges (Opsahl et al., 2010). Strength centrality thus captures the strength of a node in a weighted network as the sum of the weights of the edges connected to it (Liu et al., 2018a):

$$s_i = \sum_{j \in V} w_{ij} \tag{6}$$

where $V$ is the set of neighbours of node $i$ and $w_{ij}$ is the weight of the edge connecting nodes $i$ and $j$. Strength centrality is used to analyse the hierarchical and connectivity level of ports. Ports with a high strength centrality dominate the shipping network and hold leading positions. This measure is also used along with other measures to define the ego of a network in order to analyse the spatial heterogeneity of the network (Liu et al., 2018b).
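The strength of each node can be sketched as a sum of edge weights over its neighbours. In the example below the weights are hypothetical weekly service frequencies between ports, chosen purely for illustration.

```python
from collections import defaultdict


def strength_centrality(weighted_edges):
    """Sum of the weights of the edges attached to each node (undirected)."""
    strength = defaultdict(float)
    for i, j, weight in weighted_edges:
        strength[i] += weight
        strength[j] += weight
    return dict(strength)


# Hypothetical weights: weekly services on each port-to-port link.
weighted_edges = [
    ("A", "B", 4),
    ("A", "C", 2),
    ("B", "C", 1),
    ("C", "D", 5),
    ("D", "E", 3),
]
strength = strength_centrality(weighted_edges)
print(strength)
```

Here ports C and D tie on strength (8 each) even though C has the higher degree, which shows how weights can change the ranking obtained from edge counts alone.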

While degree centrality can be used to show the importance of a node, it only concerns the immediate neighbours of that node and does not consider the nodes in the network that are indirectly connected to it. Betweenness centrality overcomes this limitation. It is a measure of the extent to which a node lies on paths between other nodes in a network (Newman, 2010). Thus, it considers the role of a node through its indirect links with other nodes in the entire network. Formally, the betweenness centrality of a node is the sum of the fractions of the shortest paths (geodesics) between other nodes that pass through it (Newman, 2005):

$$C_B(i) = \sum_{s \neq i \neq t} \frac{\sigma_{st}(i)}{\sigma_{st}} \tag{7}$$

where $\sigma_{st}(i)$ is the number of geodesics from node $s$ to node $t$ that pass through node $i$ and $\sigma_{st}$ is the number of geodesics from node $s$ to node $t$. Nodes with high betweenness may have much influence in a network through their control over information passing between other nodes (Newman, 2010). This measure can be used to identify the central role of a port in a region; ports with high betweenness centrality have more potential accessibility (Hadas et al., 2017). The measure can also be applied to find the ports which play an intermediary role or act as hub ports in the network (Jeon et al., 2019). Hu and Zhu (2009) argued that betweenness centrality can also be used with degree centrality to identify potentially congested ports in a busy maritime network.
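A minimal sketch of how betweenness can be computed for an unweighted, undirected network is given below. It uses Brandes' accumulation scheme over breadth-first searches, which is one common way to count geodesics and is not necessarily the procedure used in the studies cited above. The toy network and the function name are hypothetical.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalised betweenness for an unweighted, undirected network."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []  # nodes in order of non-decreasing distance from s
        preds = {v: [] for v in adj}  # predecessors on geodesics from s
        sigma = {v: 0 for v in adj}  # number of geodesics from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a geodesic to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate from the farthest nodes back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair (s, t) was visited from both ends.
    return {v: value / 2 for v, value in score.items()}


# Hypothetical five-port network.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
betweenness = betweenness_centrality(adj)
print(betweenness)
```

In this toy network port C lies on four geodesics between other ports and port D on three, while A, B and E lie on none, so C and D are the intermediaries.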

The fourth measure of centrality is closeness centrality, a measure based on the average shortest paths between nodes (Li et al., 2015). The closeness centrality of a node is defined as the inverse of its average farness, which is the average length of the shortest paths from that node to all of the other nodes in the network (Jackson, 2010):

$$C_C(i) = \frac{n-1}{\sum_{j \neq i} l(i,j)} \tag{8}$$

where $l(i,j)$ is the number of edges in the geodesic between nodes $i$ and $j$ and $n$ is the number of nodes. Nodes with a higher closeness centrality are more important than other nodes (Zhao et al., 2014).

Because the average shortest path of a port indicates the average number of shipping routes passing through it in the network (Tsiotas and Polyzos, 2015), closeness centrality can be used to evaluate the central or hub ports in the network based on the shortest path concept (Kang and Woo, 2017). In other words, it can be used to assess the reachability of a port to the other ports in the studied region (Li et al., 2015). It is also a useful measure of a shipping line’s service coverage or the “shipping accessibility” of ports in the network (Zhao et al., 2014).
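Closeness centrality, as defined above (the inverse of the average geodesic length to all other nodes), can be sketched with a breadth-first search from each node. The sketch assumes an unweighted, connected network; the toy ports and function names are hypothetical.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Number of edges on the geodesic from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness_centrality(adj):
    """(n - 1) divided by the total geodesic length to all other nodes."""
    n = len(adj)
    result = {}
    for i in adj:
        dist = shortest_path_lengths(adj, i)
        if len(dist) < n:
            raise ValueError("this sketch assumes a connected network")
        result[i] = (n - 1) / sum(dist.values())
    return result


# Hypothetical five-port network.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
closeness = closeness_centrality(adj)
print(closeness)
```

Port C reaches the other four ports in five steps in total (closeness 0.8), whereas the peripheral port E needs nine steps, which gives it the lowest closeness.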

The betweenness and closeness measures of centrality mentioned above indicate the centrality of a node purely in terms of its connectivity with the other nodes in the network. One aspect of these measures is that they treat ‘the other nodes’ equally. For example, two nodes having the same value of degree centrality are regarded as equally important in the network. The above centrality measures overlook the importance of the other nodes that a node is connected to in the network; all else being the same, a port with a link to a central or hub port is considered as more important than one with a link to a regional or peripheral port.

Eigenvector centrality overcomes the limitation of the above centrality measures; it measures the centrality of a node in terms of its connectivity within the network as well as the importance of the nodes that are connected to it. This means that calculating the centrality of node $i$ requires the centralities of the other nodes ($j \neq i$) that are connected to it. The use of eigenvectors helps to solve this interdependence issue (Bonacich, 1972), whereby a node becomes important because it is connected to other important nodes (Jackson, 2010). Eigenvector centrality is proportional to the sum of the centralities of the neighbours:

$$C_e(i) = a \sum_{j} y_{ij}\, C_e(j) \tag{9-a}$$

where $a$ is a constant, $y_{ij}$ is one if $j$ is connected to $i$ and zero otherwise, and $C_e(j)$ is the centrality of node $j$ (Jackson, 2010). Since $y_{ij}$ are the elements of the adjacency matrix $Y$, solving the system of equations (9-a) for all nodes is equivalent to solving for the eigenvector $C_e$, with elements $C_e(i)$, in the following equation:

$$C_e = a\, Y C_e \tag{9-b}$$

Because eigenvector centrality indicates the importance of a port in terms of its connectivity with important ports in the network (Lu et al., 2018), it can be used to evaluate the central role of ports (Shanmukhappa et al., 2018) and to measure the “development potential” of a port (Zhao et al., 2014).
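One simple way to approximate eigenvector centrality is power iteration: every score is repeatedly replaced by the sum of the scores of the neighbours and then rescaled. The sketch below is an illustration under the assumption of a connected, non-bipartite, undirected network, for which the iteration converges; the adjacency matrix and function name are hypothetical.

```python
from math import sqrt


def eigenvector_centrality(y, max_iter=1000, tol=1e-12):
    """Power iteration on the adjacency matrix; returns (eigenvalue, scores)."""
    n = len(y)
    scores = [1 / sqrt(n)] * n  # unit-length starting vector
    eigenvalue = 0.0
    for _ in range(max_iter):
        raw = [sum(y[i][j] * scores[j] for j in range(n)) for i in range(n)]
        eigenvalue = sqrt(sum(x * x for x in raw))  # length of Y times scores
        updated = [x / eigenvalue for x in raw]
        converged = max(abs(a - b) for a, b in zip(updated, scores)) < tol
        scores = updated
        if converged:
            break
    return eigenvalue, scores


# Hypothetical five-port network (rows and columns ordered A, B, C, D, E).
Y = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
eigenvalue, scores = eigenvector_centrality(Y)
print(round(eigenvalue, 3), [round(s, 3) for s in scores])
```

The proportionality constant in the definition above corresponds to the reciprocal of the returned eigenvalue. In this toy network ports D and A have the same degree, yet A scores higher than D because its neighbours are themselves better connected.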

A clustering coefficient is a measure of a network considering its clusters. It is based on the notion that nodes with similar characteristics or attributes tend to cluster into the same group (cluster) (Holland and Leinhardt, 1971; Watts and Strogatz, 1998). The clustering coefficient measures the network cohesion at the local and global levels (Hu and Zhu, 2009). At the local level, consider the neighborhood of a node. The local clustering coefficient of a node ($C_i$) is the number of links between the nodes within its neighborhood ($E_i$) divided by the number of possible links between them (Watts and Strogatz, 1998):

$$C_i = \frac{2E_i}{k_i(k_i - 1)} \tag{10}$$

where $k_i$ is the number of nodes in the neighborhood of node $i$.

At the global level, the global clustering coefficient of a network is the sum of the clustering coefficients of all nodes divided by the number of nodes ($n$) in the network:

$$C = \frac{1}{n} \sum_{i=1}^{n} C_i \tag{11}$$

Nodes with similar characteristics are classified into the same cluster leading to a network with a high density of edges. A node that has a high clustering coefficient is likely to be connected to other nodes with a short distance (Wang et al., 2011). This measure can be applied to study the characteristics of maritime networks, especially the level of shipping connection between the neighborhood of a port (Guo et al., 2017; Jiang et al., 2019). It can be used as an index showing the local shipping services around a port (Tsiotas and Polyzos, 2015). The hierarchical structure of a maritime transport network also can be analysed using the clustering coefficient (Liu et al., 2018a).
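The local and global clustering coefficients described above can be sketched as follows for an undirected network. Assigning a coefficient of zero to nodes with fewer than two neighbours is a convention assumed here, and the toy ports and function names are hypothetical.

```python
from itertools import combinations


def local_clustering(adj, i):
    """Existing links among the neighbours of i over the possible links."""
    neighbours = adj[i]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumed convention: no pair of neighbours exists
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))


def global_clustering(adj):
    """Average of the local coefficients over all nodes."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)


# Hypothetical five-port network.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
print({i: local_clustering(adj, i) for i in adj})
print(global_clustering(adj))
```

Ports A and B have a coefficient of one because their two neighbours are directly linked, while port C, with three neighbours and only one link among them, has a coefficient of one third.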

The average shortest path length of a network ($L$) is the sum of the shortest path lengths (geodesics) over all pairs of nodes ($d_{ij}$) divided by the number of node pairs, which equals the maximum possible number of edges:

$$L = \frac{2}{n(n-1)} \sum_{i<j} d_{ij} \tag{12}$$

*L* can be used to measure the efficiency of a network in terms of its connectivity; a small value means that the network’s connectivity is highly efficient, and vice versa. In maritime transport networks, the smaller the average distance between any two ports, the less transshipment and cargo transfer is required to transport goods within the network (Hu and Zhu, 2009; Hu and Zong, 2013). The average shortest path length is also used to analyse the level of cooperation within a network. A network with a small average shortest path length is considered a tight network in which ports are closely connected to each other (Caschili et al., 2014).
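As an illustrative sketch, the average shortest path length of an unweighted, connected, undirected network can be obtained by running a breadth-first search from every node and averaging over all node pairs. The toy network and the function name are assumptions made for this example.

```python
from collections import deque


def average_shortest_path_length(adj):
    """Mean geodesic length over all pairs of distinct nodes."""
    n = len(adj)
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        if len(dist) < n:
            raise ValueError("this sketch assumes a connected network")
        total += sum(dist.values())
    # Every unordered pair was counted twice, once from each end.
    return total / (n * (n - 1))


# Hypothetical five-port network.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
L = average_shortest_path_length(adj)
print(L)
```

For the toy network the ten port pairs are separated by 17 links in total, so on average 1.7 links (and therefore less than one transshipment) separate any two ports.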

Assortativity is the measure of tendency in which nodes are mostly linked to nodes of similar attributes (Piraveenan et al., 2009). In other words, there is a tendency for a node to connect with other nodes similar to it in some ways.

The similarity is often measured in terms of the degrees of the nodes. The literature has several measures of assortativity. For example, it can be calculated from the average nearest neighbour degree of the nodes in the network (Hu and Zhu, 2009):

$$k_{nn,i} = \frac{1}{k_i} \sum_{j} a_{ij} k_j \tag{13}$$

where $k_{nn,i}$ is the average degree of the nearest neighbours of a node $i$ with degree $k$; its average over all nodes of degree $k$ is denoted as $k_{nn}(k)$. Networks are called ‘assortative’ if $k_{nn}(k)$ is an increasing function of $k$. If $k_{nn}(k)$ is a decreasing function of $k$, networks are referred to as ‘disassortative’. In the case of weighted networks, the average degree of the nearest neighbours can be calculated by:

$$k^{w}_{nn,i} = \frac{1}{s_i} \sum_{j} a_{ij} w_{ij} k_j \tag{14}$$

Assortativity can be used to probe the architecture or hierarchical structure of a maritime network (Hu and Zhu, 2009). Tran and Haasis (2014) stated that shipping lines tend to connect ports which have similar degree characteristics. In addition, assortativity has been used alongside other topological properties to analyse the network structure (Ducruet and Notteboom, 2012). It has also been used to analyse the connection of ports in a multilayer network (Ducruet, 2017). Table 1 provides a summary of all measures to give a better understanding of network statistical properties.
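The average nearest neighbour degree used above to assess assortativity can be sketched for an unweighted network as follows. The grouping of nodes by degree follows the verbal definition in the text; the toy network and function names are hypothetical, and nodes without neighbours are simply skipped.

```python
from collections import defaultdict


def average_neighbour_degree(adj):
    """Mean degree of the neighbours of each node that has neighbours."""
    return {
        i: sum(len(adj[j]) for j in adj[i]) / len(adj[i])
        for i in adj
        if adj[i]
    }


def neighbour_degree_by_degree(adj):
    """Average of the per-node values over all nodes sharing a degree k."""
    per_node = average_neighbour_degree(adj)
    groups = defaultdict(list)
    for i, value in per_node.items():
        groups[len(adj[i])].append(value)
    return {k: sum(v) / len(v) for k, v in sorted(groups.items())}


# Hypothetical five-port network.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
print(average_neighbour_degree(adj))
print(neighbour_degree_by_degree(adj))
```

A curve that rises with the degree would suggest an assortative network and a falling curve a disassortative one; a network this small shows no clear trend and only illustrates the calculation.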

## 3. Network models, their applications to maritime transport and implications for future research

This section reviews the network models that have been applied to maritime economic research. Among these is the random graph model, also known as the Erdős and Rényi (1959) model, which integrates graph theory and probability theory (Frieze and Karoński, 2016). This model is often applied as a benchmark to study the topology and connectivity of a network (Karyotis and Khouzani, 2016). The model assumes that edges are assigned to nodes at random, which leads to various properties of the network structure (Newman, 2003). The model can be specified depending on the probability distribution, resulting in specific random graphs.

A uniform random graph is often denoted as $G_{(n,m)}$, with $n$ being the number of nodes and $m$ being the number of links or edges, where $0 \le m \le \binom{n}{2}$. This random graph model is based on the uniform probability distribution of links, so that every graph $G$ with $n$ nodes and $m$ edges is equally likely (Frieze and Karoński, 2016):

$$P(G) = \binom{\binom{n}{2}}{m}^{-1} \tag{15}$$

A binomial random graph is often denoted as $G_{(n,p)}$, with $n$ being the number of nodes and $p$ being the probability that a link is assigned to a pair of nodes. The probability of a graph $G_{(n,p)}$ with $m$ edges can be calculated as follows (Frieze and Karoński, 2016):

$$P(G) = p^{m} (1-p)^{\binom{n}{2} - m} \tag{16}$$

The random graph model focuses only on the topologies of networks with a probability distribution of edges. The model does not consider how edges and nodes are formed (Cai, 2017). For maritime transport networks, no research has yet applied this model in network analysis. A forthcoming paper uses the random graph model to analyse the formation of a cruise shipping network and its properties.
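The two random graph specifications can be sketched with the Python standard library. The functions below are illustrative only: nodes are labelled 0 to n-1, the names are hypothetical, and the fixed seed merely makes the sketch repeatable.

```python
import random
from itertools import combinations


def uniform_random_graph(n, m, rng):
    """Choose m of the n(n-1)/2 possible edges uniformly at random."""
    possible = list(combinations(range(n), 2))
    return rng.sample(possible, m)


def binomial_random_graph(n, p, rng):
    """Include each possible edge independently with probability p."""
    return [pair for pair in combinations(range(n), 2) if rng.random() < p]


rng = random.Random(42)  # fixed seed, used only for repeatability
g_nm = uniform_random_graph(10, 15, rng)
g_np = binomial_random_graph(10, 0.3, rng)
print(len(g_nm), len(g_np))
```

The uniform specification fixes the number of edges exactly, whereas in the binomial specification the number of edges is itself random, with an expected value of p times the number of node pairs.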

As an alternative to the random graph model, the block model aims to decompose a network into clusters by mapping nodes that have similar characteristics into clusters or groups (Salter-Townshend et al., 2012). The model provides an analysis of interactions between and within network clusters. Thus, the connection between nodes $i$ and $j$ is not formed randomly but instead depends on their characteristics. The probability of the link between $i$ and $j$ in the block model can be specified as (Jackson, 2010):

where $X_i$ and $X_j$ are the vectors of characteristics of $i$ and $j$, while $\beta_i$ and $\beta_j$ are the parameter vectors of $i$ and $j$ respectively. The model can be used to check the homophily of a network (McPherson et al., 2001). Typically, block modelling is used for community detection (Oliveira and Gama, 2012). The model can be applied to identify clusters in a maritime network in terms of the stochastic block model (SBM) (Bouveyron et al., 2015). Thus, it can be applied to analyse port choice decisions, i.e. how various factors affect the decision to run a shipping service between two ports.
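The idea that link probabilities depend on group membership can be illustrated with a deliberately simplified stochastic block model. This sketch does not implement the characteristic-based specification cited above; it assumes only two probabilities, one for pairs within the same group and one for pairs in different groups. The regional labels, probabilities and function name are hypothetical.

```python
import random
from itertools import combinations


def stochastic_block_model(groups, p_within, p_between, rng):
    """Sample undirected links whose probability depends on group membership."""
    edges = []
    for i, j in combinations(sorted(groups), 2):
        p = p_within if groups[i] == groups[j] else p_between
        if rng.random() < p:
            edges.append((i, j))
    return edges


# Hypothetical ports labelled by region.
groups = {"A": "north", "B": "north", "C": "north", "D": "south", "E": "south"}
rng = random.Random(7)  # fixed seed, used only for repeatability
edges = stochastic_block_model(groups, p_within=0.8, p_between=0.1, rng=rng)
print(edges)
```

Setting the within-group probability well above the between-group probability produces homophily, i.e. dense clusters of similar ports with few links across clusters.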

Although the random graph and block models presented above can be applied to understand many properties of a network, these models need to be extended to investigate homophily and the effect of various factors on network formation (Newman, 2003). The exponential random graph model (ERGM), an extension of the $G_{(n,p)}$ model, is deployed to fill this gap. ERGM is a statistical model of social networks that explains the formation of networks, network links and relationships. It is based on the assumption that the emergence of a relationship might be influenced by other relationships or individual characteristics, represented as (Robins et al., 2007):

$$\Pr(Y = y) = \frac{1}{k} \exp\left\{ \sum_{A} \eta_A\, g_A(y) \right\} \tag{18}$$

where $k$ is the normalizing constant that ensures that (18) is a proper probability distribution, $\eta_A$ is the parameter corresponding to the configuration of type $A$, and $g_A(y)$ is the network statistic for that configuration.

ERGM can be used to infer the formation of relationships in a network (Jiao et al., 2017) and to study the relationships between individuals and the factors that influence the establishment of connections (Cranmer and Desmarais, 2011). Because ERGM can be applied to simulate random networks with a set of components, it makes use of essential information about the network structure such as the degree distribution and the number of edges, triangles and connected components (Mukherjee, 2011). The model could be applied to maritime transport networks to investigate the homophily of ports and the influence of ports’ characteristics on the network.
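To make the role of the normalizing constant concrete, the sketch below enumerates every undirected graph on a very small node set and assigns each one an ERGM probability based on two network statistics, the number of edges and the number of triangles. The choice of these two statistics, the parameter values and the function name are assumptions made for illustration. Brute-force enumeration is only feasible for a handful of nodes; it is not how ERGMs are estimated in practice.

```python
from itertools import combinations, product
from math import exp


def ergm_distribution(n, eta_edges, eta_triangles):
    """Probability of every graph on n nodes under an edge-triangle ERGM."""
    pairs = list(combinations(range(n), 2))
    weights = {}
    for present in product([0, 1], repeat=len(pairs)):
        edges = {pair for pair, bit in zip(pairs, present) if bit}
        n_edges = len(edges)
        n_triangles = sum(
            1
            for a, b, c in combinations(range(n), 3)
            if (a, b) in edges and (a, c) in edges and (b, c) in edges
        )
        weights[present] = exp(eta_edges * n_edges + eta_triangles * n_triangles)
    k = sum(weights.values())  # normalizing constant
    return {graph: weight / k for graph, weight in weights.items()}


dist = ergm_distribution(4, eta_edges=-1.0, eta_triangles=0.5)
print(len(dist), sum(dist.values()))
```

With both parameters set to zero every graph is equally likely, which recovers a binomial random graph with link probability 0.5. A negative edge parameter makes sparse graphs more probable, while a positive triangle parameter favours clustered ones.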

Despite its potential application to transport network research, it is interesting to note that no study has been found applying ERGM. A notable related exception is Guo et al. (2017), who applied the Barabási-Albert scale-free random network model with a preferential attachment mechanism. The authors found that the degree distribution of the China-Japan-Korea shipping network follows a power-law distribution. Moreover, the maritime traffic flow exhibits a hierarchy in which hub ports play a central role in network connectivity.

Although network measures and models are applied to many disciplines of network research, limited research has been found applying network analysis, especially social network analysis models such as the block model, the random graph model and the exponential random graph model, to maritime transport. Thus, future research can focus more on SNA applications to investigate the formation, connectivity, centrality, relationship and cooperation between nodes in transport networks. For example, the cooperation of ports in a network can be analysed by applying game theory to networks, combining game-theoretic and network analyses. Another example is how the connectivity of the maritime network is affected by ports’ attributes.

As many networks are not ‘fixed’ in nature but instead constantly expand and change, with new nodes and links being formed while some current nodes and links vanish, network dynamics and their effects on the behaviour of nodes and links merit further investigation. Future research could, for example, examine how cruise ship voyages can be planned considering the available (presumably fixed) nodes of the network as well as the available voyages. This paper does not examine how network analysis models can be connected to other models, for example optimization models. Future work can therefore also incorporate other network aspects and models into SNA.

To be more practical, the analysis can be extended to allow for dynamic networks with changing nodes and links (voyages). For simplicity, much research assumes homogeneous nodes and undirected links, although in practice nodes are heterogeneous and links are directed. Therefore, future research may relax such assumptions. Figure 1 presents a maritime network analysis (MNA) framework as an example of how SNA can be applied to maritime transport.

## 4. Conclusion

This paper reviewed the applications of network theory, especially social network analysis models, to maritime networks, where ports are nodes linked by shipping services. It has been found that various network measures are useful in helping to explore the properties of transport networks. For example, density can be used to measure the connectivity of the entire network at the global level. Similarly, degree distribution can be used to explore the topology of maritime networks, e.g. whether a network is scale-free depends on whether its degree distribution follows the power law.

At the local level, the importance of an individual node in the network can be evaluated using various measures of centrality, including degree, betweenness, closeness and eigenvector centrality. While degree centrality indicates the connectivity of a node to other nodes in the network, it only takes into account the nearest neighbour nodes. On the other hand, betweenness and closeness centrality can be used to measure the role of an individual node in the entire network, given the number of shortest paths passing through it and its distance to all other nodes in the network, respectively. However, none of these measures takes into account the importance of the nodes that the node under study is connected to. This is where eigenvector centrality can be applied, giving an indication of ‘not what you know but who you know’.

SNA can also be applied to study clusters or subnetworks within a network. The relationship between ports in the same group can be analysed using the clustering coefficient based on their characteristics or attributes. This measure is also used to analyse the topology of networks in terms of the small-world network (high clustering coefficient), while assortativity is useful for analysing the connectivity of ports with similar attributes. The average shortest path length is deployed to analyse the efficiency of a transport network.

The random graph model provides a basic framework for further analysis of a network’s formation. It assumes that nodes are connected by edges with independent probability, which captures the random nature and many important resulting properties of social, economic and transport networks. The model can be specified depending on specific probability distributions, including uniform, binomial and Poisson random graphs. The block model analyses network formation considering the characteristics of nodes belonging to different groups with similar attributes. This model can be applied to analyse the interaction between members within a group and between groups. The exponential random graph model is an extension of the random graph model and can be applied to examine homophily and the effect of influential factors on network formation.

Although metrics and models reflect the properties of maritime networks and ports, limited research has applied SNA to investigate maritime transport in many aspects, especially spatial metrics. Thus, the metrics should be adopted or used along with other models to present the spatial properties and visualization of the networks.