Create multiple directed edges in a networkx graph
I want to display the information in the following example dataset as a directed graph with multiple edges between nodes.
Attached is an example of the kind of graph I expect, as well as my code, which does not produce the expected output.
Thanks,
G = nx.from_pandas_edgelist(df, 'source', 'destination', edge_attr='number of passengers', create_using=nx.DiGraph())
pos = nx.random_layout(G)
nx.draw(G,
pos=pos)
edge_labels = nx.get_edge_attributes(G, "Edge_label")
nx.draw(G, with_labels=True)
nx.draw_networkx_edge_labels(G, pos, edge_labels)
plt.show()
date      source  destination  number of passengers
20190101  NY      BERLIN       10
20190102  NY      PARIS        50
20190103  NY      BERLIN       40
20190104  BERLIN  PARIS        20
20190105  NY      PARIS        15
20190106  BERLIN  NY           17
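One possible direction, sketched under assumptions rather than as a definitive answer: a `DiGraph` holds at most one edge per ordered node pair, so the two NY to BERLIN rows collapse into one, whereas a `MultiDiGraph` keeps each row as a separate edge. The attribute names `passengers` and `date` below are illustrative choices, not names from the question. Note also that the posted code reads an attribute called "Edge_label", while the attribute it actually stores is 'number of passengers'.

```python
import networkx as nx

# Rows copied from the example dataset: (date, source, destination, passengers).
rows = [
    (20190101, "NY", "BERLIN", 10),
    (20190102, "NY", "PARIS", 50),
    (20190103, "NY", "BERLIN", 40),
    (20190104, "BERLIN", "PARIS", 20),
    (20190105, "NY", "PARIS", 15),
    (20190106, "BERLIN", "NY", 17),
]

# A MultiDiGraph stores every row as its own edge; in a DiGraph a later
# NY -> BERLIN row would overwrite the earlier one.
G = nx.MultiDiGraph()
for date, source, destination, passengers in rows:
    G.add_edge(source, destination, passengers=passengers, date=date)

# In a multigraph the labels are keyed by (source, destination, key).
edge_labels = nx.get_edge_attributes(G, "passengers")
```

The same graph can presumably be built from the dataframe by passing `create_using=nx.MultiDiGraph()` to `nx.from_pandas_edgelist`. When drawing, parallel edges overlap if they are straight, so a curved `connectionstyle` such as `"arc3,rad=0.2"` in `nx.draw_networkx_edges` is one way to keep them apart.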
See also questions close to this topic

How to replace values in pandas dataframe
My goal is to write a program that replaces the unique values in a pandas dataframe.
The following code performs the operation:
# replace values
print(f" {s1['A1'].value_counts().index}")
for i in s1['A1'].value_counts().index:
    s1['A1'].replace(i,1)
print(f" {s2['A1'].value_counts().index}")
for i in s2['A1'].value_counts().index:
    s2['A1'].replace(i,2)
print("s1 after replacing values")
print(s1)
print("******************")
print("s2 after replacing values")
print(s2)
print("******************")
Expected: The values in the first dataframe s1 should be replaced with 1s. The values in the second dataframe s2 should be replaced with 2s.
Actual:
Int64Index([8, 5, 2, 7, 6], dtype='int64')
Int64Index([2, 8, 5, 6, 7, 4, 3], dtype='int64')
s1 after replacing values
    A1        A2   A3  Class
3    5  0.440671  2.3      1
9    8  0.070035  2.9      1
14   2  0.868410  1.5      1
29   6  0.587487  2.6      1
34   8  0.652936  3.0      1
38   8  0.181508  3.0      1
45   8  0.953230  3.0      1
54   7  0.737604  2.7      1
68   5  0.187475  2.2      1
70   5  0.511385  2.3      1
71   8  0.688134  3.0      1
73   2  0.054908  1.5      1
87   8  0.461797  3.0      1
90   2  0.756518  1.5      1
91   2  0.761448  1.5      1
93   5  0.858036  2.3      1
94   5  0.306459  2.2      1
98   5  0.692804  2.2      1
******************
s2 after replacing values
    A1        A2   A3  Class
0    2  0.463134  1.5      3
1    8  0.746065  3.0      3
2    6  0.264391  2.5      2
4    2  0.410438  1.5      3
5    2  0.302902  1.5      2
..  ..       ...  ...    ...
92   5  0.775842  2.3      2
95   5  0.844920  2.2      2
96   5  0.428071  2.2      2
97   5  0.356044  2.2      3
99   5  0.815400  2.2      3
Any help understanding how to replace the values in these dataframes would be greatly appreciated. Thank you.
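A likely explanation, offered as a sketch rather than a confirmed diagnosis: `Series.replace` returns a new Series by default and leaves the original untouched, so the loop above discards every result. Assigning the result back to the column keeps it. The small frame below is a hypothetical stand-in for `s1`; only the column names follow the question.

```python
import pandas as pd

# Hypothetical stand-in for s1; the values are made up for illustration.
s1 = pd.DataFrame({"A1": [5, 8, 2, 6], "A2": [0.44, 0.07, 0.87, 0.59]})

# Replace every distinct value of A1 with 1 and assign the result back,
# because replace() returns a new Series instead of changing s1 in place.
unique_values = list(s1["A1"].unique())
s1["A1"] = s1["A1"].replace(unique_values, 1)
```

Since every value in the column ends up as 1, the plain assignment `s1["A1"] = 1` would have the same effect here; the `replace` form is only needed when some values should stay as they are.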

Conda environment missing dependencies when trying to download THOR tracker
I am new to Linux in general and have recently got into vision coding. I am trying to download the THOR real-time tracker from https://github.com/xlsr/THOR. While trying to create the conda environment, I got a missing dependencies error:
~/Downloads/THOR$ conda env create -f environment.yml
Solving environment: failed
ResolvePackageNotFound:
  - ca-certificates==2019.1.23=0
  - freetype==2.9.1=h8a8886c_1
  - ninja==1.9.0=py37hfd86e86_0
  - openssl==1.1.1c=h7b6447c_1
  - libgcc-ng==8.2.0=hdf63c60_1
  - mkl==2019.4=243
  - readline==7.0=h7b6447c_5
  - libstdcxx-ng==8.2.0=hdf63c60_1
  - cudatoolkit==10.0.130=0
  - mkl_random==1.0.2=py37hd81dba3_0
  - olefile==0.46=py37_0
  - six==1.12.0=py37_0
  - pytorch==1.1.0=py3.7_cuda10.0.130_cudnn7.5.1_0
  - mkl_fft==1.0.12=py37ha843d7b_0
  - intel-openmp==2019.4=243
  - libffi==3.2.1=hd88cf55_4
  - cffi==1.12.3=py37h2e261b9_0
  - zstd==1.3.7=h0b5b093_0
  - numpy-base==1.16.4=py37hde5b4d6_0
  - sqlite==3.28.0=h7b6447c_0
  - python==3.7.3=h0371630_0
  - jpeg==9b=h024ee3a_2
  - torchvision==0.3.0=py37_cu10.0.130_1
  - pillow==6.0.0=py37h34e0f95_0
  - libpng==1.6.37=hbc83047_0
  - ncurses==6.1=he6710b0_1
  - libedit==3.1.20181209=hc058e9b_0
  - libgfortran-ng==7.3.0=hdf63c60_0
  - libtiff==4.0.10=h2733197_2
  - zlib==1.2.11=h7b6447c_3
  - xz==5.2.4=h14c3975_4
  - numpy==1.16.4=py37h7e9f1db_0
  - blas==1.0=mkl
I have tried downloading a few of them with sudo install, pip, or conda, but either the required version or the package itself is not found. The tracker is from 2019 and I do not think it has been updated since, but it is the best real-time tracker I have found that does not require recognition and training, which is exactly what I am looking for. I would appreciate any help, or suggestions for other trackers that use the GPU and do not need training or recognition.

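Every package in the error is pinned to an exact build string (the part after the second `=`), and such builds are often unavailable on a different platform or channel. One workaround that is sometimes suggested is to drop the build strings from environment.yml and let conda pick a compatible build. The helper below is a hypothetical sketch of that edit, not part of conda or THOR, and it does not guarantee that the old versions themselves can still be resolved.

```python
import re

# Matches "name=version=build" or "name==version=build".
_PIN = re.compile(
    r"^(?P<name>[A-Za-z0-9_.\-]+)={1,2}(?P<version>[^=]+)=(?P<build>[^=]+)$"
)

def strip_build(spec: str) -> str:
    """Return 'name=version' for a build-pinned spec; leave other specs unchanged."""
    cleaned = spec.strip()
    match = _PIN.match(cleaned)
    if match is None:
        return cleaned
    return f"{match['name']}={match['version']}"
```

For example, `strip_build("freetype==2.9.1=h8a8886c_1")` gives `"freetype=2.9.1"`. Applying it to each entry under `dependencies:` in a copy of environment.yml yields a looser file to try with `conda env create -f`.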
How to understand a code to calculate fluxes
I am unable to understand how this code works. Here, window is a line segment defined by two points, and it is where we calculate the flux. I want to understand exactly how the code works.
    times = np.zeros((len(x)), dtype=int)
    center_x, center_y = 0.5 * (window.point1[0] + window.point2[0]), 0.5 * (window.point1[1] + window.point2[1])
    safety_value = 3.0
    radius = safety_value * 0.5 * (math.sqrt((window.point1[0] - window.point2[0])**2 + (window.point1[1] - window.point2[1])**2))
    # fictive boaders
    left_boarder, right_boarder = center_x - radius, center_x + radius
    bottom_boarder, up_boarder = center_y - radius, center_y + radius
    i = 1
    while (i < len(x)-1):
        if (left_boarder <= x[i] <= right_boarder) and (x[i] > x[i-1]) and np.sign(window.perpendicular_distance_unnormalized(x[i], y[i])) != np.sign(window.perpendicular_distance_unnormalized(x[i-1], y[i-1])) and window.is_point_on_segment(x[i], y[i]) and window.is_point_on_segment(x[i-1], y[i-1]):
            times[i] += 1
            i += 1
        else:
            i += 1
    return times

    result = 0
    for i in range(y_min, y_max):
        result += UX[x_, i]
    return result

def calculate_vel_fluxes(ux, windows):
    vel_fluxes = np.empty((len(windows), ux.shape[0]))
    for i in range(len(windows)):
        for j in range(ux.shape[0]):
            vel_fluxes[i, j] = uFlux(ux[j], windows[i])
    return vel_fluxes
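As far as can be read from the first fragment, its core test is a sign change: a step from point i-1 to point i is counted when the two points lie on opposite sides of the window line. A minimal sketch of that idea follows. The function names `side_of_line` and `count_crossings` are hypothetical, `side_of_line` only stands in for `window.perpendicular_distance_unnormalized`, and the sketch leaves out the bounding-box, direction (`x[i] > x[i-1]`), and on-segment checks of the original.

```python
def side_of_line(p1, p2, x, y):
    """Signed, unnormalized distance of (x, y) from the line through p1 and p2."""
    return (p2[0] - p1[0]) * (y - p1[1]) - (p2[1] - p1[1]) * (x - p1[0])

def count_crossings(xs, ys, p1, p2):
    """Mark each step i at which the track moves from one side of the line to the other."""
    crossings = [0] * len(xs)
    for i in range(1, len(xs)):
        before = side_of_line(p1, p2, xs[i - 1], ys[i - 1])
        after = side_of_line(p1, p2, xs[i], ys[i])
        if before * after < 0:  # opposite signs: the line was crossed on this step
            crossings[i] = 1
    return crossings

# A track moving right along y = 1 across a vertical window at x = 1.
example = count_crossings([0, 0.5, 1.5, 2], [1, 1, 1, 1], (1, 0), (1, 2))
# example == [0, 0, 1, 0]
```

The second fragment appears to be a different kind of flux: it sums the values `UX[x_, i]` along the window for each time step, and `calculate_vel_fluxes` repeats that for every window and every frame of `ux`.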

How do i populate a dataframe with the results of a function that searches other dataframes?
I am trying to construct a dataframe that is populated by the results of a series of search functions across a number of dataframes, and I don't know where to start. I'm new to Python.
The results table I'm constructing is a matrix, with each indexed row referencing a dataframe and each column representing a list. The desired dataframe looks like:
answer_df  List 1  List 2  List 3
P1         ?       ?       ?
P2         ?       ?       ?
P3         ?       ?       ?
P4         ?       ?       ?
P5         ?       ?       ?
P6         ?       ?       ?
The values need to come from the results of an "is in" search function, where P1 is searched with the contents of each list.
Example dataframe:
P1
Index  Diagnosis  Meds  Tests  Obs
0      A12        NAN   NAN    NAN
1      B15        NAN   NAN    NAN
2      C28        NAN   NAN    NAN
3      NAN        D22   NAN    NAN
4      NAN        E91   NAN    NAN
5      NAN        NAN   F14    NAN
6      NAN        NAN   NAN    M55

P2
Index  Diagnosis  Meds  Tests  Obs
0      K11        NAN   NAN    NAN
1      L01        NAN   NAN    NAN
2      C28        NAN   NAN    NAN
3      NAN        X94   NAN    NAN
4      NAN        E91   NAN    NAN
5      NAN        NAN   F14    NAN
6      NAN        NAN   Y02    NAN
A list example is:
List 1  A12 L01 D22 K88 F14 M55 N67
List 2  A12 F14 N64 P01 Y02 M55
I want to populate answer_df by counting the number of matches between P1/P2 and List 1/List 2, so that it looks like this:
answer_df  List 1  List 2  List 3
P1         4       3       ?
P2         2       1       ?
P3         ?       ?       ?
P4         ?       ?       ?
P5         ?       ?       ?
P6         ?       ?       ?
But I also need to repeat this function for all other lists and dataframes (3*6 = 18 searches in total). Any help would be much appreciated.
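One way this could be approached, as a sketch under assumptions: keep the dataframes and the lists in two dictionaries, count for each pair how many non-missing cells of the dataframe appear in the list, and build the answer table from the nested result. The names `frames`, `lists`, and `count_matches` are illustrative, and the sample frames assume that "NAN" in the question stands for real missing values. On the sample data this counts both F14 and Y02 for P2 against List 2, so it gives 2 in that cell.

```python
import numpy as np
import pandas as pd

nan = np.nan
frames = {
    "P1": pd.DataFrame({
        "Diagnosis": ["A12", "B15", "C28", nan, nan, nan, nan],
        "Meds": [nan, nan, nan, "D22", "E91", nan, nan],
        "Tests": [nan, nan, nan, nan, nan, "F14", nan],
        "Obs": [nan, nan, nan, nan, nan, nan, "M55"],
    }),
    "P2": pd.DataFrame({
        "Diagnosis": ["K11", "L01", "C28", nan, nan, nan, nan],
        "Meds": [nan, nan, nan, "X94", "E91", nan, nan],
        "Tests": [nan, nan, nan, nan, nan, "F14", "Y02"],
        "Obs": [nan] * 7,
    }),
}
lists = {
    "List 1": ["A12", "L01", "D22", "K88", "F14", "M55", "N67"],
    "List 2": ["A12", "F14", "N64", "P01", "Y02", "M55"],
}

def count_matches(df, codes):
    """Count the non-missing cells of df whose value appears in codes."""
    cells = pd.Series(df.to_numpy().ravel()).dropna()
    return int(cells.isin(codes).sum())

# Outer keys become columns (lists), inner keys become rows (dataframes).
answer_df = pd.DataFrame({
    list_name: {name: count_matches(df, codes) for name, df in frames.items()}
    for list_name, codes in lists.items()
})
```

Adding P3 to P6 and List 3 to the two dictionaries extends the same table to all 18 searches without further code.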

How do I correctly apply group by on pandas dataframe?
COL:SPM  COL:BS  COL:PL2  COL:PL3  sum
CCTC     BG      OP       OTH      1
CCTC     BG      tech     OTH      3
CCTC     BG      OP       OTH      5
CCTC     BG      Info     OTH      10
I am applying groupby because I want my data to be grouped on all of these columns (SPM, BS, PL2, PL3):
Expected Result:
COL:SPM  COL:BS  COL:PL2  COL:PL3  sum
CCTC     BG      OP       OTH      6
CCTC     BG      tech     OTH      3
CCTC     BG      Info     OTH      10
The result I am getting:
COL:SPM  COL:BS  COL:PL2  COL:PL3  sum
CCTC     BG      OP       OTH      some unverified integer value
df.groupby(['SPM','BS','PL2','PL3'])['sum'].sum().reset_index()
I do not understand why I am getting the wrong result. I have searched the net but have been unsuccessful in finding a solution.
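For comparison, here is a small sketch that rebuilds the sample table and runs the same groupby. It assumes the columns are really named 'SPM', 'BS', 'PL2', and 'PL3', as in the groupby call, and that 'sum' is numeric. If the real column holds strings or the key columns contain stray whitespace, the totals can differ, so converting with `pd.to_numeric` first is a cautious step rather than a confirmed fix.

```python
import pandas as pd

# Sample rows from the question; column names follow the groupby call.
df = pd.DataFrame({
    "SPM": ["CCTC"] * 4,
    "BS": ["BG"] * 4,
    "PL2": ["OP", "tech", "OP", "Info"],
    "PL3": ["OTH"] * 4,
    "sum": [1, 3, 5, 10],
})

# Make sure the summed column is numeric; strings would be concatenated instead.
df["sum"] = pd.to_numeric(df["sum"])

# sort=False keeps the groups in order of first appearance.
grouped = (
    df.groupby(["SPM", "BS", "PL2", "PL3"], sort=False)["sum"]
    .sum()
    .reset_index()
)
```

On this data the result has three rows with sums 6, 3, and 10, which matches the expected table, so the unexpected value probably comes from something in the real data that the sample does not show.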

Reading a specific type of file on bucket S3
I am trying to read a specific type of file from an S3 bucket with Databricks, but I can't find a way to do that. If you could help me, I would appreciate it a lot. Thanks
Case that works (OK):
df = spark.read.load('/mnt/dataset/json/file.json', format='json')
df.display()
Case that I am trying to do (NOT OK):
df = spark.read.load('/mnt/dataset/json/*.json', format='json')
df.display()

How to draw two figures in one using networkx?
I am trying to draw two figures together, but they come out displaced one with respect to the other. First, I define the badminton pitch with a function.
def bad_courtV(ax=None, color='black'):
    # If an axes object isn't provided to plot onto, just get current one
    if ax is None:
        ax = plt.gca()
    ax.plot([0,61],[0,0], color=color)
    ax.plot([61,61], [0,134],color=color)
    ax.plot([61,0], [134,134],color=color)
    ax.plot([0,0], [134,0],color=color)
    ax.plot([0,61],[67,67], color=color, linewidth=4)
    ax.plot([4.6,4.6],[0,134],color=color)
    ax.plot([56.4,56.4],[0,134], color=color)
    ax.plot([61,0],[126.4,126.4], color=color)
    ax.plot([0,61],[7.6,7.6], color=color)
    ax.plot([61,0],[86.8,86.8], color=color)
    ax.plot([0,61],[47.2,47.2], color=color)
    ax.plot([30.5,30.5],[47.2,0], color=color)
    ax.plot([30.5,30.5],[86.8,134], color=color)
    return ax
and then plot a network of a sequence of strokes:
plt.figure(figsize=(5,10))
pos_bad = pos_bad
ax=bad_courtV()
color='black'
nx.draw_networkx(B, pos_bad,with_labels=True,width=0.25,ax=ax,edge_color='gray',arrows=True)
plt.tight_layout()
plt.show()
The result is a figure like this: one figure is in the bottom-left corner and the other is in the upper-right corner.
Of course, I would like the network to be drawn on the pitch. Can anyone help me with how to do it? Thanks in advance

How to plot my networkx graph the way I want?
I'm trying to create a chessboard with networkx and plot it, so my code looks like this:
import networkx as nx
import matplotlib.pyplot as plt
from string import ascii_lowercase

G = nx.Graph()
letters = ascii_lowercase[:8]
for index, letter in enumerate(letters):
    for number in range(1,9):
        if number<8:
            G.add_edge(f"{letter}{number}", f"{letter}{number+1}")
        if number>1:
            G.add_edge(f"{letter}{number}", f"{letter}{number-1}")
        if index<7:
            if number<8:
                G.add_edge(f"{letter}{number}", f"{letters[index+1]}{number+1}")
            if number>1:
                G.add_edge(f"{letter}{number}", f"{letters[index+1]}{number-1}")
            G.add_edge(f"{letter}{number}", f"{letters[index+1]}{number}")
        if index>0:
            if number<8:
                G.add_edge(f"{letter}{number}", f"{letters[index-1]}{number+1}")
            if number>1:
                G.add_edge(f"{letter}{number}", f"{letters[index-1]}{number-1}")
            G.add_edge(f"{letter}{number}", f"{letters[index-1]}{number}")
nx.draw(G)
plt.show()
which is quite nice, but I'm trying to make it look more like a chessboard. Is there any way I can force the plot to fix the a1 node at the bottom left, or otherwise make it always look like a square?
Thanks
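One option, sketched here under the assumption that a fixed grid layout is what is wanted: `nx.draw` accepts a `pos` dictionary that maps each node to (x, y) coordinates, so the layout can be computed from the square names instead of being left to the default spring layout. The dictionary name `pos` is just the conventional choice.

```python
from string import ascii_lowercase

letters = ascii_lowercase[:8]

# Map each square name to grid coordinates: files a..h on x, ranks 1..8 on y.
pos = {
    f"{letter}{number}": (index, number - 1)
    for index, letter in enumerate(letters)
    for number in range(1, 9)
}
```

Passing it as `nx.draw(G, pos=pos, with_labels=True)` should place a1 at the bottom left and h8 at the top right, because the y axis grows upwards. Using a square figure size, for example `plt.figure(figsize=(8, 8))` before drawing, helps keep the board square.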

How to colour the node with the highest centrality differently from all other nodes?
I have this data frame:
   Entrez Gene Interactor A  Entrez Gene Interactor B
0  6840                      7431
1  6640                      5217
2  823                       7431
I wrote this code to generate a dash network:
dash_elements = []
for index,i in final_net.iterrows():
    dict1 = {}
    dict2 = {}
    dict1['data'] = dict2
    dict2['id'] = str(i[0])
    dict2['label'] = str(i[0])
    dict3 = {}
    dict1['data'] = dict3
    dict3['id'] = str(i[1])
    dict3['label'] = str(i[1])
    final_dict2 = {}
    final_dict3 = {}
    final_dict2['data'] = dict2
    final_dict3['data'] = dict3
    dash_elements.append(final_dict2)
    dash_elements.append(final_dict3)
    dict4 = {}
    final_dict4 = {}
    final_dict4['data'] = dict4
    dict4['source'] = str(i[0])
    dict4['target'] = str(i[1])
    dash_elements.append(final_dict4)
print(dash_elements)

import dash
import dash_core_components as dcc
from dash import html
import dash_cytoscape as cyto
from dash.dependencies import Input, Output
import plotly.express as px

app = dash.Dash(__name__)
app.layout = html.Div([
    html.P("Dash Cytoscape:"),
    cyto.Cytoscape(
        id='cytoscape',
        elements = dash_elements,
        layout={'name': 'breadthfirst'},
        style={'width': '1000px', 'height': '1000px'}
    )
])

if __name__ == '__main__':
    app.run_server()
The output is:
I want to highlight node 7431, as it is the node with the highest centrality (i.e. the most connections to other nodes). I want to colour this node differently (e.g. red), but not manually, as this is test data. So I want to calculate every node's centrality and then colour the node with the highest centrality, instead of manually colouring node 7431.
I identified that node 7431 was the node with the max degree like this:
import networkx as nx
G = nx.from_pandas_edgelist(final_net,'Entrez Gene Interactor A','Entrez Gene Interactor B',['Entrez Gene Interactor A', 'Entrez Gene Interactor B'])
degree = G.degree
degrees = [val for (node, val) in sorted(G.degree(), key=lambda pair: pair[0])]
nodes = [node for (node, val) in sorted(G.degree(), key=lambda pair: pair[0])]
maxIndexList = [i for i,j in enumerate(degrees) if j==max(degrees)]
max_nodes = [nodes[i] for i in maxIndexList]
print(max_nodes)
And now all I have to do I guess is add a line somewhere to the dash code saying 'if node in max_nodes, colour with X colour, else, colour with Y colour'.
I can find how to add a general stylesheet here, but I can't see how to turn that into an if/else loop. Could someone show me how to do that (or any other code for colouring nodes differently?)
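A possible approach, given as a sketch: a Cytoscape stylesheet is a plain list of selector/style dictionaries, so the if/else can be expressed by building that list in Python, with one general rule for all nodes and one override per node in `max_nodes`. The colours and the hard-coded `max_nodes` value below are placeholders; in the real app `max_nodes` would come from the degree calculation above.

```python
# Placeholder for the list computed from G.degree() in the question.
max_nodes = [7431]

# Base style for every node, followed by one override per highest-degree node.
stylesheet = [
    {"selector": "node", "style": {"label": "data(label)", "background-color": "grey"}},
]
for node in max_nodes:
    # Element ids were created with str(...), so the node is compared as a string.
    stylesheet.append(
        {"selector": f'node[id = "{node}"]', "style": {"background-color": "red"}}
    )
```

The list is then passed to the component as `cyto.Cytoscape(..., stylesheet=stylesheet)`. Later rules take precedence over earlier ones, which is why the general `node` rule comes first.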