State of the network (an update)

Hello amazing GRIDNET community,

You know the drill - we have good news and we have bad news. But worry not, let’s start on the bright side, shall we?

The good news is that our recent efforts in refactoring the OpenCL sub-system are paying off in spades! Stability is at an all-time high, and we’re thrilled with the performance. Don’t just take our word for it - you’ve seen it firsthand by monitoring the spell.gridnet.org node, and we appreciate your active participation and valuable feedback.

Now onto the less exciting part. We’ve identified an issue with the current version of our mobile app, which seems to be having trouble interacting with the GRIDNET Core 1.1.1 nodes. Rest assured, our crack team is on the case, and they’re hard at work troubleshooting this as I write.

Want a peek at the action? Tune into our YouTube LIVE stream where we’re investigating the issue in real time. Transparency is our motto, and we want you to be a part of the process.

Now, we’re hoping that this won’t necessitate an update to the mobile app - but we’ll be prepared to spring into action if it does. We will keep you informed every step of the way.

Let’s end on a high note. Remember that in this technological space, no problem is too big to be solved. Yes, we’ve hit a bit of a roadblock, but with your support and our dedicated team, we’re sure to surmount it.

Stay tuned for further updates, and as always, keep your spirits high and your faith in GRIDNET even higher!

Happy decentralizing!

Best,
The GRIDNET Team

1 Like

I’ve been assigned and to investigate this issue (currently on YouTube LIVE), from what I can tell, the issue yesterday might had been opportunistic and stemming from momentary bad network conditions.


I’ve just successfully dispatched a value transfer to @Alpacalypse, which went through just fine. Can you please elaborate where the report originated from? In the meantime, I’ll keep testing.

Update: I came around some unexpected behaviour in particular 1) it takes too long to refresh the account’s balance 2) I got one TX not registered even though the network’s load is extremely low. Looking further into this.

Update 2: proceeding with mutex congestion analysis (on YouTube LIVE). I can tell just by looking at some of the internals that the major monitoring component of UDT conversations is being locked for waaaay too long.

For a moment I thought I was having it when all of the sudden the processing is instantaneous.


Still into mutex congestion analysis.

I’ve found the top-level logEvent() function to be imposing excessive time overhead which is sort of unexpected and it’s used heavily throughout the system.

image

Yet again, when sorted by Synchronization congestion, the logEvent() function tops throughout the system’s sub-systems, even within the web-socket sub-system.

Still I would not expect this to affect the processing of transactions or reception of these as log-event locks for 1-3 seconds tops.

Also validating time threads spend while being asleep, nothing suspicious on the face of it though:

As of now I’m unable to reproduce the situation with unresponsive processing of transactions.

Commencing with optimizations of the logEvent() function.

Initial state of this function:

after my tweaks:

/// <summary>
/// Logs an event.
/// This function is a-synchronous.
/// The Virtual Terminal Output MAY be throttled, it is thus NOT guaranteed to be shown.
///  Priority is taken into account at all times.
/// </summary>
/// <param name="info"></param>
/// <param name="category"></param>
/// <param name="priority"></param>
/// <param name="nType"></param>
/// <param name="color"></param>
/// <param name="forceIt"></param>
void CTools::logEvent(std::string info, eLogEntryCategory::eLogEntryCategory category, uint64_t priority, eLogEntryType::eLogEntryType nType, eColor::eColor color, bool forceIt)
{
	//Local Variables - BEGIN
	bool previousLocked = false;
	eBlockchainMode::eBlockchainMode bm = getBlockchainMode();
	std::shared_ptr<CSettings> settings = CSettings::getInstance(bm);
	//Local Variables - END

	//Operational Logic - BEGIN
	if (color != eColor::none)
	{
		info = getColoredString(info, color);
	}
	else {
		switch (nType)
		{
		case eLogEntryType::notification:
			break;
		case eLogEntryType::warning:
			info = getColoredString("[Warning]: ", eColor::lightPink) + info;
			break;
		case eLogEntryType::failure:
			info = getColoredString("[FAILURE]: ", eColor::cyborgBlood) + info;
			break;
		case eLogEntryType::unknown:
			break;
		default:
			break;
		}
		
	}

	if (bm == eBlockchainMode::Unknown)
	{
		bm = eBlockchainMode::TestNet;
	}

	if (!forceIt)
	{
		
		if (!settings->getLogEvents(category, nType))
			return;
		if (priority < settings->getMinEventPriorityForConsole())
			return;

		mPreviousEventTxtGuardian.lock();
		previousLocked = true;
		if (mPreviousEventTxt.compare(info) == 0)
		{
			mPreviousEventTxtGuardian.unlock();
			return;
		}
	
	}
	if (!previousLocked)
	{
		mPreviousEventTxtGuardian.lock();
	}
	mPreviousEventTxt = info;
	mPreviousEventTxtGuardian.unlock();

	writeLine(info, true, false, eViewState::eventView, "[" + logEntryCategoryToStr(category) + "," + logEntryTypeToStr(nType) + "]:");
	//Operational Logic - END
}

It needs yet to be seen whether the problem with logEvent did not stem from nested invocation within of its body.

Update: We’ve already gained some performance boost. I’m proceeding currently on YouTube LIVE with further analysis.

Similarly above, we’ve been wasting ~200ms on querying RocksDB for a boolean value indicating whether logs are enabled at all. These queries cumulate many many times. I’ll optimize this as well, be pre-caching the value while the system us operational.

Update: I’ve introduced additional Robin-Hood map based cache atop of RocksDB for these boolean settings:

[Developer enters the frame wearing a wizard’s hat and a t-shirt that says “I code, therefore I exist.”]

"Hey there, my dear code sorcerers and logic alchemists! Welcome to my mystical virtual dojo of development, where binary is our language, bugs are our nemesis, and every semicolon is sacred. You are live on a spellbinding episode of ‘Code It Like You Mean It’, and I am your humble guide in this enchanting land of brackets and booleans, the mighty lord of loops, the dragon slayer of deadlocks… you can just call me [Developer’s Name].

Today, we’re diving headfirst into the digital rabbit hole, where we will encounter mythical creatures such as cache monsters, settings golems, and the rare and elusive boolean values! My dear companions, I present to you the magical incantations I have conjured up - the enchanting setBooleanValue, the elusive getBooleanValue, and the crafty putIntoCache!

These are no mere functions, my friends, these are the keys to a kingdom where data flows like rivers, where settings are not just ‘set’, but bestowed with honor, and where a cache doesn’t just ‘store’, but cherishes every byte with all its silicon heart!

So brace yourselves, grab your most powerful keyboard, channel your inner code wizard, and let’s get this journey started. As always, remember - ‘May the Code be with You!’"

[Developer waves a wand, lightning graphic flashes across the screen, and we dive into the code.]

/// <summary>
/// Retrieves a boolean setting.
/// </summary>
/// <param name="getGlobal"></param>
/// <returns></returns>
bool CSettings::getBooleanValue(std::string key, bool &val, bool isGlobal)
{
	sstd::lock_guard<std::recursive_mutex> lock(mWarden);
	try {
		//Try the Local Robin-Hood based cache first.
		std::vector<uint8_t> retS = getFromCache(std::string(key), isGlobal);
		if (retS.empty())
		{
			retS = getStorage(isGlobal)->getValue(std::string(key)); //if no luck then use RocksDB.

			//Only use the in-house cache when the value is not empty.
			if (!retS.empty())
			{
				putIntoCache(std::string(key), retS, isGlobal);
			}
		}

		if (retS.empty())
			return false;

		val = (std::stoi(mTools->bytesToString(retS)) == 1);
		return true;
	}
	catch (...)
	{
		return false;
	}
}

/// <summary>
/// Retrieves data from cache.
/// </summary>
/// <param name="getGlobal"></param>
/// <returns></returns>
std::vector<uint8_t> CSettings::getFromCache(const std::string& key, bool isGlobal )
{
	std::lock_guard<std::mutex> lock(mCacheGuardian);
	std::string key2 = key + (isGlobal ? "_G" : "_L");
	std::vector<uint8_t> keyVec(key2.begin(), key2.end());
	auto it = mSettingsCache.find(keyVec);
	if (it != mSettingsCache.end())
	{
		return it->second;
	}
	return {};
}

void CSettings::putIntoCache(const std::string& key, const std::vector<uint8_t>& value, bool isGlobal)
{
	// Prepare the key outside of the critical section
	std::string key2 = key + (isGlobal ? "_G" : "_L");
	std::vector<uint8_t> keyData(key2.begin(), key2.end());

	// Now we enter the critical section and perform the update
	std::lock_guard<std::mutex> lock(mCacheGuardian);
	mSettingsCache[std::move(keyData)] = value;
}

We now try the in-house cache for a hit otherwise fallback ro RocksDB.

[Enter PauliX, shrouded in a cape adorned with HTML tags, a crown of curly braces on her head, and a staff shaped like a giant stylus. She appears on screen with a flash of CSS animations.]

"Good morrow, noble citizens of the internet realm, it is I, PauliX, the HTML5 sorceress, the weaver of the web, and the enchantress of elements. I hail from the magical kingdom of DOM, where every tag has a tale to tell, and every attribute has an allure unmatched. I’ve journeyed through the kingdom of JavaScript, ridden the waves of WebSockets, and battled the beasts of Browser Compatibility. And today, I have been beckoned here by the captivating charisma of C++ code and the melodious murmuring of mutex locks.

The name of @CodesInChaos the binary bard, has echoed through the ether, and his latest spellcasting spectacle has caught my ethereal eye. ‘Set’ and ‘Get’ Boolean values, you say? Now, that’s a dance I haven’t seen in the star-lit ballroom of my browsers. But fear not, dear @CodesInChaos for I bring with me the blessings of the HTML elders, and the wisdom of countless <div>s and <span>s.

And oh, I am eager to witness the grandeur of ‘putIntoCache’ - the vault where data treasures are guarded, and bytes are whispered the lullaby of latency.

So my dear @CodesInChaos, the stage is set, the scrollbars are silent, and in the golden glow of your code, the silicon hearts of many a developer flutter with anticipation. I stand ready, armed with an array of arrays, fueled by functions, and wrapped in a wrapper function. In the grand theatre of Turing completeness, let the symphony of syntax and the ballet of booleans commence. As we HTML magicians say, ‘May your code always be semantic and your layouts forever responsive!’"

[PauliX lifts her stylus-shaped staff, a burst of pixelated sparkles cascade across the screen, and she sits, ready to witness the magic of code unfold.]